Apparatus and method for Lempel-Ziv data compression with management of multiple dictionaries in
content addressable memory.

A class of lossless data compression algorithms uses a memory-based dictionary (312) of finite size
to facilitate the compression and decompression of data. To reduce the loss in data compression
caused by dictionary resets, a standby dictionary (328) is used to store a subset of encoded data
entries previously stored in a current dictionary. In a second aspect of the invention, data is
compressed/decompressed according to the address location of data entries contained within a
dictionary built in a content addressable memory (CAM) (312). In a third aspect of the invention,
the minimum memory/high compression capacity of the standby dictionary scheme is combined with the
fast single-cycle per character encoding/decoding capacity of the CAM circuit. In a fourth aspect
of the invention, a selective overwrite dictionary swapping technique is used to allow all data
entries to be used at all times for encoding character strings (450-472).

This application is a continuation in part of copending U.S. application Ser. No. 07/996,808 filed
December 23, 1992.

BACKGROUND OF THE INVENTION 

This invention relates generally to data compression and decompression methods and apparatus and
more particularly to implementations of lossless data compression algorithms which use a dictionary to store
compression and decompression information.

A major class of compression schemes encodes multiple-character strings using binary sequences or
"codewords" not otherwise used to encode individual characters. The strings are composed of an
"alphabet" of single-character strings. This alphabet represents the smallest unique piece of
information the compressor processes. Thus, an algorithm which uses eight bits to represent its
characters has 256 unique characters in its alphabet. Compression is effective to the degree that
the multiple-character strings represented in the encoding scheme are encountered in a given file
or data stream. By analogy with bilingual dictionaries used to translate between human languages,
the device that embodies the mapping between uncompressed code and compressed code is commonly
referred to as a "dictionary."

Generally, the usefulness of a dictionary-based compression scheme is dependent on the frequency with 
which the dictionary entries for multiple-character strings are used. If a fixed dictionary is optimized for one 
file type it is unlikely to be optimized for another. For example, a dictionary which includes a large number of 
character combinations likely to be found in newspaper text files is unlikely to compress efficiently data base 
files, spreadsheet files, bit-mapped graphics files, computer-aided design files, et cetera. 

Adaptive compression schemes are known in which the dictionary used to compress given input data is
developed while that input data is being compressed. Codewords representing every single character
possible in the uncompressed input data are put into the dictionary. Additional entries are added
to the dictionary as multiple-character strings are encountered in the file. The additional
dictionary entries are used to encode subsequent occurrences of the multiple-character strings. For
example, matching of current input patterns is attempted only against phrases currently residing in
the dictionary. After each failed match, a new phrase is added to the dictionary. The new phrase is
formed by extending the matched phrase by one symbol (e.g., the input symbol that "breaks" the
match). Compression is effected to the extent that the multiple-character strings occurring most
frequently in the file are encountered as the dictionary is developing.
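
By way of illustration only, the adaptive scheme described above can be sketched in Python as follows. The sketch follows the widely known LZW style of dictionary building; the function name `lzw_compress`, the 8-bit alphabet and the unbounded dictionary size are assumptions made for the example and are not taken from this specification.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Adaptive dictionary compression in the LZW style (illustrative sketch)."""
    # Seed the dictionary with every single-character string of an 8-bit alphabet.
    dictionary = {bytes([i]): i for i in range(256)}
    next_code = 256
    phrase = b""          # longest phrase matched so far
    codes = []
    for byte in data:
        candidate = phrase + bytes([byte])
        if candidate in dictionary:
            # The match can still be extended.
            phrase = candidate
        else:
            # Failed match: emit the code of the matched phrase and add the
            # phrase extended by the symbol that broke the match.
            codes.append(dictionary[phrase])
            dictionary[candidate] = next_code
            next_code += 1
            phrase = bytes([byte])
    if phrase:
        codes.append(dictionary[phrase])
    return codes


print(lzw_compress(b"ABABABA"))
```

For the input `ABABABA` the two single characters are emitted first, after which the repeated strings are replaced by the codes of entries created earlier in the same pass.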

During decompression, the dictionary is built in a like manner. Thus, when a codeword for a
character string is encountered in the compressed file, the dictionary contains the necessary
information to reconstruct the corresponding character string. Widely-used compression algorithms
that use a dictionary to store compression and decompression information are the first and second
methods of Lempel and Ziv, called LZ1 and LZ2 respectively. These methods are disclosed in U.S.
Patent No. 4,464,650 to Eastman et al., and various improvements in the algorithms are disclosed in
U.S. Patent Nos. 4,558,302 to Welch, and 4,814,746 to Miller et al. These references further
explain the use of dictionaries.
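
As a hedged illustration of how a decompressor can rebuild the dictionary "in a like manner", the following Python sketch decodes a list of LZW-style codes. The function name `lzw_decompress` and the code numbering (single characters at 0 to 255, new entries from 256 upward) are assumptions for the example only.

```python
def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the dictionary while decoding LZW-style codes (illustrative sketch)."""
    if not codes:
        return b""
    # Same initial state as the compressor: all single-character strings.
    dictionary = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    output = bytearray(previous)
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # The code refers to the entry the compressor created in the same
            # step; it must be the previous string plus its own first symbol.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid code {code}")
        output += entry
        # Mirror the compressor: previous phrase extended by one symbol.
        dictionary[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return bytes(output)


print(lzw_decompress([65, 66, 256, 258]))
```

No dictionary is transmitted: the decoder derives every entry from the codes it has already received, which is why both sides must start from the same initial state.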

When working on a practical implementation, the amount of memory available for compression/decom- 
pression is finite. Therefore, the number of entries in the dictionary is finite and the length of the codewords 
used to encode the entries is bounded. Typically, the length varies between 12 and 16 bits. When the input
data sequence is sufficiently long, the dictionary will eventually "fill up." Several courses of action are possible 
at this point. For example, the dictionary can be frozen in its current state, and used for the remainder of the 
input sequence. In a second approach, the dictionary is reset and a new dictionary created from scratch. In a 
third approach, the dictionary is frozen for some time, until the compression ratio deteriorates, then the
dictionary is reset.

The first alternative has the disadvantage of losing the learning capability of the basic compression algo- 
rithm. If the statistics of the input data change, the dictionary no longer follows those changes, and a rapid 
deterioration in compression ratio will occur. 

A dictionary reset method maintains the learning capability of the algorithm, but suffers from a
temporary deterioration in compression ratio when switched to an empty dictionary (e.g., all
previously accumulated knowledge of the source is lost). For example, upon reset, all entries of
the dictionary are indiscriminately disabled. Therefore, recently obtained dictionary entries, that
would likely be utilized in further data compression, are lost along with older data entries that
have a lower probability of further assisting in the compression and decompression process. Since
all data entries are lost during a dictionary reset, the compression ratio is likely to temporarily
deteriorate. Thus, the compression efficiency is less than optimal.

One method for increasing the efficiency of dictionary based data compression is discussed by
Bunton and Borriello in PRACTICAL DICTIONARY MANAGEMENT FOR HARDWARE DATA COMPRESSION,
Communications of the ACM, January 1992, Vol. 35, No. 1. Entire dictionary resets are avoided by
replacing one dictionary entry at a time: the least recently used (LRU) code is selected and then
overwritten with the next input character string. The Bunton et al. method improves the compression
ratio but has the disadvantage of requiring a large number of additional bits for each dictionary
entry to identify LRU status. Additional bits for each dictionary entry result in significantly
increased hardware costs.

Another method for reducing the number of required dictionary resets is to increase the dictionary
memory size. Increased memory size, however, increases cost and can increase the time required to
search dictionary data entries. In addition, present LRU tracking methods become less practical
with increased memory size.

Another bottleneck in compression/decompression performance is the amount of time required to search
the dictionary for previously encountered character strings. Traditionally, hashing algorithms are
used to search for previously-stored dictionary entries and to locate available memory locations
for new character strings. Typical arrangements use a RAM memory with two to four storage locations
for each dictionary entry, as disclosed in U.S. Patent No. 4,558,302 to Welch (LZW).

The hashing algorithm maps each unique dictionary entry into the RAM space at an address based on
some function of the data word contents. Since such an algorithm uses the word or fields within the
word to calculate the mapping address, more than one data word might map to the same location in
memory, causing a hashing collision. In this case, an alternative location must be found for the
data. Inevitably, as the RAM locations fill up, a second dictionary entry will hash to a
previously-used location. This situation must be resolved before compression can continue. Hashing
circuitry and, specifically, hashing collision resolution add complexity to the
compression/decompression system logic, and reduce system throughput.
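
The collision problem can be made concrete with a small Python model, offered as a sketch under stated assumptions. The class name `HashedDictionary`, the hash function and the use of linear probing to find an alternative location are all hypothetical choices for the example; the cited references may resolve collisions differently.

```python
class HashedDictionary:
    """RAM-style dictionary addressed by a hash of (prefix code, character)."""

    def __init__(self, size: int):
        self.slots = [None] * size   # each slot: (prefix_code, char, code)
        self.collisions = 0          # extra probes caused by occupied slots

    def _hash(self, prefix_code: int, char: int) -> int:
        # Hypothetical hash: any function of the word contents would do.
        return (prefix_code * 256 + char) % len(self.slots)

    def find_or_insert(self, prefix_code: int, char: int, new_code: int):
        """Return the stored code on a hit, or store new_code and return None."""
        index = self._hash(prefix_code, char)
        for _ in range(len(self.slots)):
            slot = self.slots[index]
            if slot is None:
                self.slots[index] = (prefix_code, char, new_code)
                return None
            if slot[0] == prefix_code and slot[1] == char:
                return slot[2]
            # A different entry already occupies this location: probe onward.
            self.collisions += 1
            index = (index + 1) % len(self.slots)
        raise MemoryError("dictionary RAM is full")


table = HashedDictionary(size=8)
table.find_or_insert(1, 2, 256)      # stored at its hashed location
table.find_or_insert(2, 2, 257)      # hashes to the same location
print(table.collisions)
```

Every extra probe in the loop corresponds to an additional memory access, which is the throughput cost the passage above refers to.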

Typically, the dictionary based upon the data being compressed will be a small subset of all
possible data entries. Therefore, one method for reducing hashing collisions is to increase the
number of dictionary storage locations. This approach, however, increases system complexity and
cost and prohibits integrating the memory with the compression/decompression control logic. In
addition, a larger memory could increase the search time required to determine if a character
string has previously been loaded into memory.

Another bottleneck to data compression/decompression is the amount of time and circuit complexity
required to encode and decode data character strings. For example, during data compression, after a
character string is found not to match any of the data phrases previously stored within memory, it
must be stored in an unoccupied data memory location. A codeword must be generated that uniquely
identifies the stored character string and subphrases within a character string that previously
matched dictionary data entries. The codeword is then combined with additional characters during
further data compression operations.

During data decompression, a compressed data codeword may represent an uncompressed data character
and an additional codeword, for example, a link to the rest of the uncompressed data string, as
described in Hewlett-Packard Journal, June 1989, pp. 27-31. The described HP-DC scheme encodes
codewords sequentially and stores the codewords (OMEGA) concatenated with a next byte (K) at
dictionary address locations determined by a compressed code. Therefore, the dictionary must be
read several times before the actual decompressed data string is generated. Since the compressing
and decompressing process is iterative, any additional clock cycles, other than the clock cycles
used for dictionary access, significantly increase overall compression and decompression time.
Present encoding, decoding, and dictionary search methods, however, require more than one clock
cycle to compress or decompress each input character. In addition, these encoding and decoding
algorithms require complex compression and decompression hardware.

Accordingly, there is a need for improving the performance of dictionary-based data compression
systems and for improving the encoding and decoding of data in a dictionary-based data
compression/decompression system.

SUMMARY OF THE INVENTION 

It is, therefore, an object of the invention to minimize the loss in data compression created when
the dictionary in a dictionary-based data compression system is reset.

A second object of the invention is to increase the adaptation properties of data compression systems for 
input data sequences with changing statistical characteristics. 

Another object of the invention is to reduce the amount of time required to encode/decode a character
string in a dictionary-based data compression/decompression system.

Another object of the invention is to maximize data compression capacity in a dictionary-based data com- 
pression/decompression system with a minimal amount of memory. 

An additional object of the invention is to minimize the amount of hardware and time required to selectively
update a dictionary-based data compression/decompression system.

One aspect of the invention is a data compression/decompression system that simultaneously builds a
current dictionary and a standby dictionary. The current dictionary serves the same purpose as the
dictionary in a standard data compression engine. The standby dictionary is built in parallel with
the current dictionary, so as to contain a subset of the phrases of the current dictionary. This
subset is chosen to best characterize the patterns occurring in the source data. When the current
dictionary fills up, it is replaced by the standby dictionary, and a new standby dictionary is
built "from scratch" as the new current dictionary continues to be built and used for compression.
Therefore, the compressor never switches to an empty dictionary, and the deterioration in data
compression caused by having a limited dictionary memory size is reduced.

The current dictionary starts with sufficient empty space to add new data entries, thereby allowing
continued adaptation to the source data. This feature is of paramount importance in compressing
source data with varying statistics. Although some information is lost by switching to a smaller
number of data entries in the standby dictionary, the time to rebuild the dictionary to maximum
efficiency is still less than with a complete dictionary reset. Therefore, a smaller dictionary
memory can be used with less negative impact on the data compression ratio.

The criteria for selecting entries for the standby dictionary can vary depending upon the specific
application. For example, an encoded data string is copied to the standby dictionary if it has been
matched at least once with a data entry in the current dictionary. Alternatively, the entries in
the standby dictionary can be selected according to string length, most recent data entry matches,
or any criterion that identifies entries that maximize compression in a given application.

In addition, the criteria for switching (resetting) from the current to standby dictionary can be
changed depending on the type of data or application. For example, the current dictionary can be
reset when it is filled with valid data entries. In the alternative, the current dictionary can be
reset when compression using it falls below a predetermined performance threshold, as described in
U.S. Pat. 4,847,619 to Kato et al.

In a second application of the standby dictionary, mainly for situations where data characteristics
are stationary, the compressor makes two passes at the data. In the first pass, the compressor
scans a large sample of the data. The sample is large enough to cause the current dictionary to
fill up many times, thereby causing the standby dictionary to replace the current dictionary a
proportional number of times. At each dictionary switch, the current dictionary [...] iterations,
the algorithm has built a dictionary strongly customized to the data sample. The customized
dictionary is then set as the sole dictionary reference used by the compression engine during a
second pass to compress the input data. The customized dictionary thereby performs significantly
better than a single [...]

A second aspect of the invention is a dictionary-based compression/decompression system
architecture and method which utilizes the address values of stored data entries in the dictionary
of a compression/decompression system to simplify encoding as well as decoding circuitry. The
system preferably uses a content addressable memory (CAM) with additional logic circuitry including
local feedback circuitry to provide special functions that speed up memory access and simplify
external compression/decompression logic. The memory structure has unique features that can provide
lossless data compression or decompression at a sustained rate of one character per clock cycle
without hashing or potential for hashing collisions.

Specifically, the system preferably comprises an associative memory that encodes character strings
according to the address locations of data entries contained within the memory. An input character
string combination which has not previously [...] data stream is stored as a new data entry within
the dictionary. The CAM is organized into "words" which each store a unique character string data
entry. The memory performs an associative parallel search with an input character string, with
selected bits in a "word," on all words previously stored in the dictionary. In the event of a
match, a match line associated with the data entry is activated. All the match lines are then
encoded into a single codeword representing the character string. The codeword is then combined
with the next input character and again compared with the data entries previously stored in memory.
Thus, character strings are assigned codewords according to their address locations in memory.
When [...] string (e.g., its address) is output and another search is started with a new character
string starting with the character (K) that caused the match to fail. The compressed data character
(codeword) is a pointer to a data entry in the dictionary. Therefore, character strings are decoded
by using the compressed data character as an address into the decompression dictionary. For
example, initially, an external compressed character is used as an address into the dictionary. The
data entry at the decoded address location is then read. If the data entry output from memory does
not require further decompression (e.g., the memory output is the "root" of a linked list) then the
data entry is output. If the data entry contains another codeword (e.g., a further encoded link to
another dictionary address location), then the character at that address is output and the codeword
at that address is fed back to memory as the next dictionary address.
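
A minimal software model of this address-based encoding and linked-list decoding is sketched below in Python. It is not the circuit of the invention: the names `cam_compress`, `decode_entry` and `ROOT_LIMIT`, the choice to treat codes below 256 as single-character roots, and the sequential list search standing in for the parallel CAM comparison are all assumptions for the example.

```python
ROOT_LIMIT = 256  # assumption: codes below this value stand for single characters


def cam_compress(data: bytes):
    """Encode strings by the address at which (codeword, character) words are stored."""
    cam = []      # cam[i] is the word stored at address ROOT_LIMIT + i
    codes = []
    if not data:
        return codes, cam
    code = data[0]
    for char in data[1:]:
        word = (code, char)
        if word in cam:
            # Match: the matching address becomes the codeword for the string.
            # A real CAM compares all stored words in parallel in one cycle.
            code = ROOT_LIMIT + cam.index(word)
        else:
            # Failed match: store the word at the next address, output the
            # codeword, and restart from the character that broke the match.
            cam.append(word)
            codes.append(code)
            code = char
    codes.append(code)
    return codes, cam


def decode_entry(cam, code: int) -> bytes:
    """Follow codeword links until a root character is reached."""
    chars = bytearray()
    while code >= ROOT_LIMIT:
        prefix, char = cam[code - ROOT_LIMIT]   # read the word at this address
        chars.append(char)                      # output its character
        code = prefix                           # feed the codeword back
    chars.append(code)                          # root of the linked list
    chars.reverse()                             # characters emerge last-first
    return bytes(chars)


codes, cam = cam_compress(b"ABABABA")
print(codes, b"".join(decode_entry(cam, c) for c in codes))
```

For brevity the decoding helper reads the dictionary left by the compressor; an actual decompressor would rebuild an identical dictionary from the codes as it goes.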

An internal address generator is used for both compression and decompression and resets coincident
with a memory reset. Any write to the memory (an explicit write or a failed match) will result in
the address incrementing to the next address. Incrementing need not be sequential, but may be, for
example, pseudorandom, as long as both compressor and decompressor address generators are initiated
to the same state and increment in the same way, with the result that both compression and
decompression dictionaries will be identical. This logic eliminates the need for generating/storing
addresses in external control logic, and can result in improved compression/decompression
performance (e.g., fewer clock cycles and faster data compression).
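
The requirement that both address generators start in the same state and step in the same way can be illustrated with a small pseudorandom generator. The Python sketch below uses a 4-bit linear feedback shift register; the class name `AddressGenerator`, the 4-bit width, the seed and the feedback taps are assumptions chosen for the example and are not specified in this document.

```python
class AddressGenerator:
    """4-bit LFSR that visits each nonzero address once per 15 steps."""

    def __init__(self, seed: int = 0b1000):
        self.seed = seed
        self.state = seed

    def reset(self) -> None:
        # Reset coincides with a memory reset, restoring the initial state.
        self.state = self.seed

    def next_address(self) -> int:
        """Return the current address and advance; called on every memory write."""
        address = self.state
        # Feedback from bits 3 and 2 (polynomial x^4 + x^3 + 1).
        bit = ((self.state >> 3) ^ (self.state >> 2)) & 1
        self.state = ((self.state << 1) | bit) & 0xF
        return address


compressor_side = AddressGenerator()
decompressor_side = AddressGenerator()
print([compressor_side.next_address() for _ in range(5)])
print([decompressor_side.next_address() for _ in range(5)])
```

Because the sequence depends only on the seed and the feedback rule, two independently running generators assign every new entry to the same address without exchanging any address information.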

To further reduce the time required for data accession, special update circuitry allows a memory
search and a data write to be performed during the same clock cycle. When a character string is
compared with the data entries within memory, a failed search requires the string to be stored as a
new data entry. The next available address location is already known from the address generator and
the character string is already residing at the memory data input. Therefore, control logic can be
used to automatically write the character string into memory if no match occurs during the search
[...] the memory search clock cycle. If a match is found during the search, [...] prevents the
character string from being loaded into memory as a valid data entry.
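
The combined search-and-write behaviour can be modelled in software as a single operation that either reports a match or stores the word at the next address. The following Python sketch is illustrative only; the class name `AutoUpdateCam`, the starting address of 256, the sequential address increment and the cycle counter are assumptions, and a Python dictionary lookup stands in for the parallel hardware comparison.

```python
class AutoUpdateCam:
    """Software model of a CAM that writes a word when its search fails."""

    def __init__(self, first_address: int = 256):
        self.words = {}                    # stored word -> address
        self.next_address = first_address  # supplied by the address generator
        self.cycles = 0                    # modelled clock cycles

    def search(self, word):
        """One modelled cycle: return the matching address, or store the word."""
        self.cycles += 1
        address = self.words.get(word)
        if address is None:
            # No match: the word at the data input is written at the next
            # available address within the same modelled cycle.
            self.words[word] = self.next_address
            self.next_address += 1
        # On a match nothing is written, so no duplicate entry is created.
        return address


cam = AutoUpdateCam()
print(cam.search((65, 66)), cam.search((65, 66)), cam.cycles)
```

In this model a failed search costs no more cycles than a successful one, which is the property the update circuitry is described as providing.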

The system and method summarized [...] system for fast compression and decompression of data. It
can be implemented [...] purpose computer or in hardware using custom or semi-custom [...] method
can be used to implement storage/retrieval of linked list data [...] it can be readily adapted to
various adaptive dictionary-based encoders.

The third aspect of the invention combines the minimum memory/high compression capacity of the
standby dictionary scheme with the fast single-cycle per character encoding/decoding capacity of
the CAM circuit. [...] uses multiple dictionaries [...] receives compressed and uncompressed data
[...] data entries into one of the dictionaries. Codewords represent [...] data entry that matches
[...]

To support multiple dictionaries, each memory location [...] The data field stores data entries and
the status [...] that data entry. During a search operation, the circuit can mask certain bits of
both the status field and the data field. This allows the system to determine which dictionary is
assigned to a data entry and to determine which memory locations are not currently assigned to a
dictionary.
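
Masked searching of this kind can be illustrated with a short Python sketch. The word layout (a 2-bit status field above a 16-bit data field), the status value 0 for an unassigned location, and the names `pack` and `masked_search` are hypothetical; the actual field widths and status encodings are defined by the figures, which are not reproduced here.

```python
DATA_BITS = 16                       # hypothetical width of the data field
DATA_MASK = (1 << DATA_BITS) - 1     # selects the data field only
STATUS_MASK = 0b11 << DATA_BITS      # selects a hypothetical 2-bit status field


def pack(status: int, data: int) -> int:
    """Build one memory word from its status and data fields."""
    return (status << DATA_BITS) | data


def masked_search(words, key: int, mask: int):
    """Return addresses whose bits equal the key wherever the mask bit is 1."""
    return [address for address, word in enumerate(words)
            if ((word ^ key) & mask) == 0]


words = [pack(0b01, 0x1234), pack(0b10, 0x1234), pack(0b00, 0)]

# Search on the data field only: finds the entry in either dictionary.
print(masked_search(words, pack(0, 0x1234), DATA_MASK))
# Search on the status field only: finds locations with status 0b00.
print(masked_search(words, pack(0b00, 0), STATUS_MASK))
```

Masking the data field turns the same search hardware into a way of finding locations by status alone, for example to locate storage not assigned to any dictionary.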

Dictionary assignments for each data [...] changing the state of the compression/decompression
circuit. By changing the circuit state, at least one dictionary is reset. This allows the storage
locations previously assigned to that dictionary to now constitute free storage locations no longer
assigned to any dictionary. These free storage locations are [...] strings. The state changes can
be triggered by different [...] the adaptability of the system to different types of data. For
example, [...] change states when one of the dictionaries becomes full or, alternatively, change
states when the compression ratio falls below a predetermined [...]

To further increase the compression ratio of the compression/decompression system, a second
Lempel-Ziv compression/decompression system (LZSD2) is utilized to selectively [...] data entries
with new [...] system allows the use of all [...] string matching at all times, but still uses the
above described Standby Dictionary methodology. Therefore, only two bits are needed to identify the
next overwrite location, regardless of dictionary size. Dictionaries are then capable of being
updated without negatively affecting the data compression rate since each data entry remains
assigned to a dictionary [...] be performed with the same compression/decompression hardware as
described [...] without negatively impacting the data compression rate.

To provide single clock cycle search capability, the [...] constructs a standby dictionary in
parallel [...] dictionaries at the same time.

The foregoing and other objects, features and advantages of the invention will become more readily
apparent from the following detailed description of a preferred embodiment which proceeds with
reference to the drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a data flow diagram for a data compression system with current and standby dictionaries
in accordance with the invention.

FIG. 2 is a detailed data flow diagram illustrating one example for the standby dictionary data
selection process of FIG. 1.

FIG. 3 is a block diagram of an example of data compression circuitry implementing current and
standby dictionaries according to the invention.

FIG. 4 is a high level block diagram showing a data compression/decompression system embodying the
present invention.

FIG. 5 is a detailed block diagram of the memory and [...] circuitry of FIG. 4.

FIG. 6 is a logic diagram of the auto-update circuitry within the address decoder of FIG. 5.

FIG. 7 is a generalized data flow diagram for the method of data compression [...] using a content
addressable memory (CAM) according to the invention.

FIG. 8 is a detailed data flow diagram for the data compression procedure of FIG. 7.

FIG. 9 is a detailed data flow diagram for the data decompression procedure of FIG. 7.

FIG. 10 is a [...] of the compression and decompression procedures in FIGS. 8 and 9.

FIG. 11 is a block diagram [...] multi-dictionary compression/decompression system according [...]

FIG. 12 shows [...] CAM shown in FIG. 11.

FIG. 13 shows the dictionary values [...] state in the ST field of FIG. 11.

FIG. 14 illustrates the state transition changes for the CAM multi-dictionary
compression/decompression system.

FIG. 15 is a logic diagram illustrating a simple hardware implementation for changing
compressor/decompressor states.

FIG. 16 is a detailed circuit diagram of the main [...] compression system shown in FIG. 11.

[...] by dictionary.

[...] with a standby dictionary.

FIG. 20 is a graphical depiction of the [...] to 19.

FIG. 21 is a graph showing [...] compression scheme.

FIGS. 22A-22E are graphical depictions of a [...] (LZSD2) compression method.

FIG. 23 is a data flow diagram showing the general method for performing LZSD2 compression.

FIGS. 24A and [...] are a detailed data flow diagram for the procedure shown in FIG. 23.

FIG. 25 is a data flow diagram showing a general method for a LZSD2 decompression method.

FIGS. 26A, 26B and 26C are a detailed data flow diagram for the procedure shown in FIG. 25.
DETAILED DESCRIPTION

In the following description, the first and second sections describe the standby dictionary and
content addressable memory aspects [...] of the first two aspects of the invention. The fourth
section describes an alternative method of operation using the system described in the third
section.

I. Data Compression/Decompression System Using A Standby Dictionary 

FIG. 1 is a data flow diagram for a data compression [...] current and standby dictionaries. The
method illustrated in FIG. 1 begins at block 8 with initialization of both the current dictionary
(CD) and the standby dictionary (SD). For example, codewords representing every single character
possible in the uncompressed input data are put into the dictionaries. Alternatively, [...] could
be empty. The encoding of character strings from the data sequence is implemented using any desired
encoding scheme.

In block 10, input data is compared with previously encoded data entries of a current dictionary to
determine whether the character string and any of the dictionary data entries match. Block 12
stores an unmatched character string as a new encoded data entry in the current dictionary. When a
match can no longer be extended, the code for the longest matched string is output at block 13.

Block 14 stores a subset of the previously encoded data entries of the current dictionary (CD) in
the standby dictionary (SD). The subset selection process in block 14, as stated above, is
alterable for specific input data to produce the highest compression ratio with a given number of
data entries in the standby dictionary. For example, data entries for the standby dictionary can be
selected based on the number of times an input character string matches a data entry within the
dictionary. Alternatively, the standby dictionary subset can be selected according to the number of
input characters represented by the encoded character string. In general, any preference scheme can
be applied at this stage.

Decision block 16 determines if a dictionary reset is required. For example, a reset is required
when the current dictionary reaches a predetermined number of encoded character string entries or
when the compression ratio has fallen below a given performance threshold. If the current
dictionary does not need to be reset, the compression engine reads a new character string and the
process returns to block 10. If the current dictionary is reset, block 18 then replaces the current
dictionary with the entries in the standby dictionary, initializes a new standby dictionary, reads
a new character string and then returns to block 10.

A dictionary based compression/decompression method according to FIG. 1 can be used to generate a
static customized current dictionary that is used to [...] data sample of the input data sequence
is selected. The current dictionary [...] repeatedly replacing the current dictionary with the
standby dictionary. The customized current dictionary is then locked in a read-only function and
used by the [...] sequence.

FIG. 2 is a detailed data flow diagram illustrating one example of a data compression algorithm
that utilizes a current and standby dictionary. [...] data string is copied into the standby
dictionary when the input data string matches an entry in the current dictionary. This procedure
assures that the data string has been "seen" at least twice in the input. The current and standby
dictionary are switched when the current dictionary is full (e.g., reached a predetermined number
of valid data entries). As mentioned above, alternative dictionary switching [...] are easily
implemented according to specific application requirements.

An input data string is compared with data entries of the current dictionary in block 20. Decision
block 22 branches to blocks 23 and 24 if there is no match with any data entry in the current
dictionary. The longest matched data string is then encoded and output at block 23.

Decision block 24 determines if the current dictionary is full. If the current dictionary is not
full, block 28 stores the data string as a data entry in the current dictionary. If the current
dictionary is full, block 26 switches the current dictionary with the standby dictionary [...]
stored in the new current dictionary. Since the current dictionary is now replaced [...] standby
dictionary, there is now space available to store new data strings. Block 30 increments the address
counter of the current dictionary. A new input character is read from the input data in block
[...], then the compare process in block 20 is repeated.

When decision block 22 determines [...] entry in the current dictionary, decision block 36 checks
to determine if the data string has previously been stored in the standby dictionary. If the data
string has not been previously [...] dictionary, a flag is set in a status field within the current
dictionary. Alternatively, [...] set in any case, eliminating block 36. The flag is associated with
the current dictionary data entry that matched the data string. The flag indicates to the
compression engine that the data entry has previously been copied into the standby dictionary. This
prevents multiple copying of the same data entry into the standby dictionary. Block 40 writes the
data string into the standby dictionary and block 42 increments [...] the data string did match
[...] in the current dictionary, block 44 adds the next input character to the present data string
and returns to block 20. If decision block 36 indicates that the data string has previously been
stored in the standby dictionary, the process goes directly to block 44 and continues as described
above.
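
The flow of FIG. 2 can be approximated in software as shown in the following Python sketch, which is offered as an illustration under stated assumptions rather than as the described circuit. The function name `compress_with_standby`, the `capacity` parameter, the seeding of both dictionaries with all single characters, the use of a set to model the per-entry flag, and the guard that skips storing when the swapped-in dictionary is itself full are all assumptions made for the example. Block numbers in the comments refer to the description above.

```python
def compress_with_standby(data: bytes, capacity: int = 4096):
    """Return (codes, swaps) for an LZW-style coder with a standby dictionary."""

    def fresh():
        # Both dictionaries start with every single-character string.
        return {bytes([i]): i for i in range(256)}

    current, standby = fresh(), fresh()
    copied = set()       # models the flag stored with each current entry
    codes, swaps = [], 0
    phrase = b""
    for byte in data:
        candidate = phrase + bytes([byte])          # block 20: compare
        if candidate in current:                    # block 22: match
            if len(candidate) > 1 and candidate not in copied:   # block 36
                copied.add(candidate)               # set the flag
                standby[candidate] = len(standby)   # blocks 40 and 42
            phrase = candidate                      # block 44: extend string
            continue
        codes.append(current[phrase])               # block 23: output code
        if len(current) >= capacity:                # block 24: full?
            current, standby = standby, fresh()     # block 26: switch
            copied = set()
            swaps += 1
        if len(current) < capacity:                 # assumption: skip if still full
            current[candidate] = len(current)       # blocks 28 and 30
        phrase = bytes([byte])
    if phrase:
        codes.append(current[phrase])
    return codes, swaps


print(compress_with_standby(b"ABABABA", capacity=258))
```

With room for only two multi-character entries, the example switches dictionaries once and still encodes the final `ABA` as a single code, because `AB` had been copied to the standby dictionary before the switch. A matching decompressor would have to mirror the same copy and switch decisions; it is omitted here for brevity.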

FIG. 3 shows an example implementation of the invention in a data compressor/decompressor
integrated circuit (IC) 50 which is a presently preferred [...] The data compressor/decompressor
(DCD) IC 50 includes a data compression/decompression engine 52 in combination with a data
compressor interface circuit [...] The DCD IC 50 is used in combination with dictionary 1 (D1)
random access memory (RAM) 88 and dictionary 2 (D2) [...] conveniently implemented in a single IC
or as separate ICs 50, 88, and 90. D1 and D2 are illustrated as RAMs but can be conveniently
implemented in content addressable memory or any alternative memory structure. The RAM is
conventional. Each RAM dictionary memory location in D1 and D2 includes a data entry field
(data_entry) 94 and 98, respectively, and a standby status field (stdby_stat) 92 and [...],
respectively.

The data entry field stores unique data strings occurring in the input data sequence. The standby status field includes a standby dictionary status flag that indicates whether the data entry in the current dictionary has previously been stored in the standby dictionary. The standby status field can conveniently include a dict_valid field for identifying valid data entries. The use of a multi-bit dict_valid field in a data compression system is described in commonly assigned U.S. patent application entitled DICTIONARY RESET PERFORMANCE ENHANCEMENT FOR DATA COMPRESSION APPLICATIONS, Serial No. 07/766,475, filing date 9/25/91, and is incorporated by reference (EP-A-0534713).

The data compression engine 52 is preferably designed to implement the LZ2 or LZ1 compression algorithm, but can be designed to implement any suitable dictionary-based compression scheme. Also, if desired, the compression engine may incorporate or be used in conjunction with automatic means for controlling dictionary reset, such as disclosed in U.S. Pat. No. 4,847,619. Being otherwise conventional, the particular algorithm and architecture of the data compression engine need not be further described.
The data compressor interface circuit includes a switch controller subcircuit 76 that monitors dictionary reset request signals 72 and data string match signals 70, which are output from the data compression engine 52 or from other circuitry associated with the data compressor/decompressor IC. Subcircuit 76 controls which RAM operates as the current dictionary and which as the standby dictionary. This subcircuit reads the stdby_stat field from the current dictionary to determine if the present encoded data entry has previously been copied into the standby dictionary.

Address generator circuit 68 supplies the binary address values used to write the data_entry field of the dictionary operating in the standby mode. A typical address generator is a binary counter, but other forms of sequence can be readily used. Associated with the subcircuits are multiplexers 56 and 58 and transceivers 84 and 86. Multiplexer 56 selects between read/write signals 60 and 62, respectively, from the data compression engine 52 and the switch controller circuit 76. Multiplexer 58 selects between address signals 64 and 66, respectively, from the data compression engine 52 and address generator 68. Transceiver 86 operates as a bus controller, selecting either data_entry field 94 or 98 to connect to data bus 87, which is connected to data compression engine 52. Transceiver 84 selects stdby_stat field 92 or 96 for connecting to switch controller 76. The multiplexers and transceivers are controlled by control signal 78, and address generator 68 is controlled by control signal 82, both from the switch controller circuit 76.

The DCD circuit 50 provides conventional data transfers between one of the dictionaries (the current dictionary) and the data compression engine 52 during normal compression operations. The system also allows the standby dictionary to receive data from the data compression engine 52 or directly from the current dictionary to create the data entry subset in the standby dictionary.

In operation of circuit 50, switch controller 76 selects between D1 and D2 as the current dictionary, for example D1. The compression engine then begins a data compression operation, reading and writing encoded data to the data_entry field 94 of D1. When the compression algorithm determines that an encoded data string is a candidate for writing into the standby dictionary (D2), for example, when an encoded data string matches a data entry within the current dictionary (D1), match signal 70 is activated. Switch controller 76 thereby checks the stdby_stat field 92 to determine whether the data entry has previously been copied into the standby dictionary. If not, the data string is written into the dict_entry field 98 of D2, at the location provided by address generator 68. In addition, the stdby_stat field 92 in the current dictionary is "set" by switch controller 76, to prevent the same encoded data string from being copied into the standby dictionary twice.

When data compression engine 52 activates reset signal 72, switch controller 76 alters the value of control signal 78. The new control signal changes the connections for the multiplexers and transceivers so that D2 is now operating as the current dictionary and D1 is now operating as the standby dictionary. The subset of data entries loaded into D2 is then used as the initial set of data entries for compression engine 52. Thus, when the data compression engine in FIG. 3 is reset, the dict_entry field of the new dictionary already contains a subset of data entries that supports a high compression ratio.
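The current/standby behavior described above can be sketched in software. This is a minimal model under stated assumptions, not the circuit of FIG. 3: the class name `DualDictionary` and its methods are hypothetical, entries are stored as plain strings rather than codeword/character pairs, and the integer `current` merely stands in for control signal 78.

```python
class DualDictionary:
    """Software model of the current/standby dictionary pair (D1/D2)."""

    def __init__(self, size):
        self.size = size
        # Two dictionaries; the list index plays the role of the memory address.
        self.dicts = [[], []]
        # One stdby_stat flag per entry of the current dictionary.
        self.stdby_stat = []
        # Which dictionary is current (stand-in for control signal 78).
        self.current = 0

    @property
    def standby(self):
        return 1 - self.current

    def add(self, string):
        """Store a new string in the current dictionary; False if it is full."""
        cur = self.dicts[self.current]
        if len(cur) >= self.size:
            return False  # a real engine would request a reset here
        cur.append(string)
        self.stdby_stat.append(False)
        return True

    def on_match(self, address):
        """Match signal: copy the matched entry to standby, at most once."""
        if self.stdby_stat[address]:
            return  # already copied, so skip the duplicate write
        sb = self.dicts[self.standby]
        if len(sb) < self.size:
            sb.append(self.dicts[self.current][address])
            self.stdby_stat[address] = True

    def reset(self):
        """Reset signal: standby becomes current; the old current is cleared."""
        old = self.current
        self.current = self.standby
        self.dicts[old] = []
        # Entries of the new current dictionary are not yet in the new standby.
        self.stdby_stat = [False] * len(self.dicts[self.current])
```

After `reset()`, the new current dictionary starts with the entries that matched at least once before the reset, which is the subset the text describes.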

Switch controller 76 can be shut off by the data compression engine by activating a specific combination of match signal 70 and reset signal 72. This allows the data compression engine to read/write encoded data
exclusively to/from a single dictionary. This is used for the customized data dictionary operation described 
above and for corn^ : : : = , 

The above-described method has proven to be advantageous. For example, 550 files containing user operating manuals were compressed using the standard UNIX "compress" command (a traditional implementation of LZ2). Then, a customized dictionary was built using the above-described current/standby dictionary method and the files were then compressed using the customized dictionary created from the data sample. The results are summarized below:

Original file size:        6,602,300 bytes
Unix compress:             2,781,686 bytes
Customized dictionary:     2,025,742 bytes
Compression improvement:   37%

Therefore, the customized dictionary provides a substantial compression improvement over prior compression methods.
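The 37% figure can be checked from the byte counts above. The short calculation below assumes that "compression improvement" means the relative gain in compression ratio (equivalently, the ratio of the two compressed sizes minus one); the text does not spell out the definition, so this interpretation is an assumption.

```python
original = 6_602_300        # bytes, from the table above
unix_compress = 2_781_686   # bytes after UNIX compress
customized = 2_025_742      # bytes with the customized dictionary

# Compression ratio = original size / compressed size.
ratio_unix = original / unix_compress
ratio_custom = original / customized

# Relative improvement of the customized dictionary over UNIX compress.
improvement = ratio_custom / ratio_unix - 1.0

print(f"UNIX compress ratio:     {ratio_unix:.2f}")
print(f"Customized ratio:        {ratio_custom:.2f}")
print(f"Improvement:             {improvement:.1%}")
```

Under that reading the improvement rounds to the 37% stated in the table.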

This aspect of the invention can be modified in arrangement and detail without departing from its basic principles. For example, it is possible to implement the current and standby dictionaries both on the same RAM by having a field that indicates whether the entry is in the standby dictionary or not. Upon reset, all non-standby dictionary entries are cleared. The address generation circuit is more complicated since, after reset, entries are not in consecutive locations. This approach is well suited for a content addressable memory (CAM) implementation as described below.
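The single-RAM variant can be illustrated with a small sketch. The names `SingleRamDictionary`, `mark_standby`, and `next_free` are hypothetical, the linear scan for a free address is only a stand-in for the more complicated address generation the text mentions, and clearing the flag of surviving entries on reset is an assumption rather than something the text states.

```python
class SingleRamDictionary:
    """One memory holding both dictionaries, distinguished by a flag field."""

    def __init__(self, size):
        self.entries = [None] * size       # data entry field
        self.in_standby = [False] * size   # standby membership flag

    def next_free(self):
        """Return the lowest empty address, or None when the memory is full."""
        for addr, entry in enumerate(self.entries):
            if entry is None:
                return addr
        return None

    def add(self, string):
        """Write a string at the next free address and return that address."""
        addr = self.next_free()
        if addr is not None:
            self.entries[addr] = string
        return addr

    def mark_standby(self, addr):
        """Flag an entry (for example, one that matched) as a standby member."""
        self.in_standby[addr] = True

    def reset(self):
        """Clear every entry that is not flagged as a standby member."""
        for addr in range(len(self.entries)):
            if self.in_standby[addr]:
                # Survivor: kept, flag cleared for the next period (assumption).
                self.in_standby[addr] = False
            else:
                self.entries[addr] = None
```

After a reset the free addresses are scattered around the surviving entries, which is why new entries no longer land in consecutive locations.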

II. Memory Circuit For Lossless Data Compressors

FIG. 4 is a diagram showing the overall arrangement of a circuit 136 for a CAM compression/decompression system according to the second aspect of the invention. The circuit includes a compression/decompression (CD) engine 142, an uncompressed data interface 138, a compressed data interface 148, and a processor interface 152. The CD engine 142 comprises a string table memory 144 and control logic 146.
The uncompressed data interface.!^ 

pressed interface 1^8, transfers cc™pres>^ signals 
for interfaces 138 and 14$ ara^ °ver Qpnj^ contains 
a First-[n/First-put .data buiffer (^|Fp),1^ V^ 

Thedrcuit.13?can ^use^jneithera^p 
between mcde^^fcfbid^ 

pressor with sjmpjjfied dedicated ? deco^^ circuit with : a . r^M repla^cinq; bto and 194 

*n FIG. 5, The fpilowing de^ 

!n the. compression m^ 138 r.eceivesjun 

from the data bus _15$ ; and supplies the^i $a 142 . The 

string-feWe mernp^ 

that are output on^ata bus 158 via^ata ^ interface 
148 receives compressed data cod^^^ ^th^cp engine 142 via 

data buffer 150. String table 144 in ^oc£era^ topic ^ U^decpmpress th^ into 
character strings and.outpyt the ; 

trpis registers for setting indirection P^ato f r^c^erand'controls other 
miscellaneous functions throug^ prpc^ ' 7^ -* - • ^ 1 , ^1"! .^ 

PIG. 5 is a detailed biqcK diagram o^]^ the control Jogip ; 146Ja^iG. 4. The 

string table memory comprises an as^ocja^ (Q^M) 188 with 

additional internal logic th^t re^uce^ corpp^^ jtime.^he ,CAM! ^88 ; is organized 

into "words" (e.g., 38?2 x 20 bits) \whe^ separate xhar^c^ string entry^Data is written 

into memory 188ona data bus 190^0^^ jl'gp/rece^Vne^ (K) input 

on bus 180 and an encoded chara^^^ ex . 
ternal characters on bus 180 come from the uncompressed data stream on l^WDAtX bus ^SA ffflG. 4) and 
the codewords come from the output of encoder 1 94. A data inpytselept J.c^ic^circujt 182 fl trjr^ugh multiplexer 
192, controls:, which bits Pf PATAJN 

on ^ s 202. The data signa , 
input 164^ both frojn control |pg.!s.146 (pl<£ ty, ^and^ rn^cti signarinpwM Wpr^^cqder 494! ,^ \ 

match signal associated with each Y^r^ one 
pfthe data entries in memp^ Encoder 
194 encodes, all H^atch sign^ bus 202. The 

codeword, is thereby Encoder, 194 
also generates a match signal 168 that \. activate^ w^e'n any data entry Jn memory 18JB* is matched with the 
character string, on data.bus 190. .... .1 r ,' '.' , ^ J. > 

An address decoder 184 selectively receives an external address from address bus 177, the internal character string output from memory 188 on bus 186, or an internal address from an address generator 170, and enables the associated memory word via word select lines 204 (e.g., 264 through 4095). The external compressed characters on bus 177 come from the compressed data stream on COMPDATA bus 158 (FIG. 4). The internal address generator 170 is controlled by match signal 168 from encoder 194, search signal 178, a read/write signal 164, and a reset signal 162. The read/write and reset signals come from control logic 146 (FIG. 4). The address generator includes a counter which is reset (e.g., to 264) upon initialization and subsequently incremented as the dictionary is built up.
The source of the address supplied to address decoder 184 is controlled by read select logic 172 and the read/write signal 164 through multiplexers 176 and 174, respectively. Read select logic 172 is controlled by reset signal 162 and the compression status of the data entry output 186 from memory 188. The data entry compression status can be determined by the value of data entry characters. For example, values greater than 256 may be allocated for encoded character strings and values less than 256 may comprise single data characters. Multiplexer 176 (MUX1) selects an input from either bus 177 or bus 186, and multiplexer 174 (MUX2) selects between the output of mux 176 and the output of address generator 170. Decoder 184 also includes an automatic update feature, described below, that allows a data search and memory update to be performed in the same memory access cycle.

FIG. 6 is a logic diagram of a preferred implementation of the automatic update feature of address decoder 184 in FIG. 5. Each address (ADDRN[19:0]) input into address decoder 184 from MUX2 174 is fed into two AND gates. AND gates 208 and 214 illustrate a single address line. AND gate 208 is also fed search signal 178 (FIG. 5) and the inverted value (NOMATCH) of match signal 168 (FIG. 5). The search signal is also inverted and fed into AND gate 214 along with a "qualified" write signal. OR gate 212 receives the outputs from the two AND gates and generates a word select signal (WORDN). The equivalent function can be provided by multiplexers or other combinational logic.
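The gate arrangement just described reduces to a small Boolean function for one address line. The sketch below is only a truth-table model; the function and parameter names are hypothetical, and timing is ignored.

```python
def word_select(addr_n: bool, search: bool, match: bool,
                qualified_write: bool) -> bool:
    """Model of WORDN for a single decoded address line (FIG. 6)."""
    nomatch = not match                                    # inverted match signal
    gate_208 = addr_n and search and nomatch               # search found nothing
    gate_214 = addr_n and (not search) and qualified_write # forced write path
    return gate_208 or gate_214                            # OR gate 212
```

With `search` active, the word line is selected only when no match is found, so the string already on the data bus can be written in the same access. With `search` inactive, the qualified write signal alone selects the word.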
The AND gate 208 output is activated when a data search is performed during a data compression operation. If a match is not found during compression, the character string must be placed into the next available address in memory. To eliminate the additional clock cycle necessary to write the data word into memory after a data search, the word select line goes high if a match does not occur. Since the character string is already on data bus 190 and the next available address is provided by address generator 170, a write can be performed as soon as the match indication is available. Thus, the inverted match signal NOMATCH activates gate 208 and thereby the word select line (WORDN) associated with the next available memory location. If a match is found during the search operation, the word select line is disabled and no write operation takes place. The qualified write signal is used to force data writes even when no match occurs in memory, for example, during an external microprocessor write operation. This update feature provides true 1 cycle per byte performance, since dictionary writes are performed without an extra memory access.
In the alternative, the WORDN signal may be used to set a "data_valid" field within memory. For example, the system in FIG. 5 can copy each new character string into memory prior to checking for a match in memory. If a match does not occur, the WORDN signal is then used to activate a "data_valid" field associated with the newly stored data entry.

Data Compression

In operation of the circuit in FIG. 5, a microprocessor (not shown) initializes the system for compression and resets memory 188. The microprocessor control signals (search signal 178, read/write signal 164, reset signal 162) come from processor interface 152 via control logic 146 (FIG. 4). The reset line is coupled to memory 188 to reset the data-valid status of each memory location. In addition, the reset line initializes the address generator to a starting memory location for storing character strings.

Several different techniques may be used for initializing single input characters. For example, single input characters may be algorithmically encoded as part of the compressed data stream. Alternatively, a set of entries representing any single input data character may be loaded into memory.

Read/write signal 164 directs mux 174 to connect the address provided by address generator 170 to address decoder 184. An external character string from uncompressed data interface 138 (FIG. 4) is supplied to the byte field (DATA_IN[7:0]) and the codeword field (DATA_IN[19:8]) of bus 190. Search signal 178 is then activated, causing memory 188 to compare the codeword/byte string with the entries in memory 188. No match will initially occur since nothing has been previously written into memory 188. Therefore, the codeword/byte string on bus 190 is written into the first available address location in memory 188 (e.g., the initialized address generated by address generator 170). Address generator 170 is then incremented and a new input character from bus 180 is read into the byte field of the memory data input. The process is repeated, continuing to write unmatched codeword/byte strings into memory 188.

On a successful match, input data select logic 182 directs multiplexer 192 to place the codeword generated from encoder 194 in the codeword field of data bus 190 (DATA_IN[19:8]). A new external character from bus 180 is then fed into the byte field (DATA_IN[7:0]) of data bus 190. The codeword thereby represents the previously matched character string. Because the codeword assigned to the character string is derived directly from the matched data entry address, significantly less control logic is required to encode input characters. In addition, by feeding the codeword back into multiplexer 192 (MUX3) and combining the codeword with the next input character, an input character can be processed each clock cycle.

The new codeword/byte string is then compared with the data entries within memory 188. The process is repeated until no match is found. At this point, the compressor outputs the codeword from the last match and writes the new codeword/byte string into memory 188. The last input character (K) fed into the byte field is then compared with the updated dictionary (in the case of a dictionary initialized to contain "root" codewords)
men^ r V"!* * Se-erating a root c£j£ to 22 

r?^!L ?T eXterria ' « h ^:(KyM^.^,186.|s l then fed into the byte field and^e 

^ t a n r ^^*i*!y. the last character K can be ou^tt- 

9 U2) " tHe * K <*" be as the *» K following OMEGA. 

When the dictionary fills up, address generator 170 activates a table-full signal 198 that indicates to the rest of the compression system (FIG. 4) that no further character strings can be written into memory. Any additional input data is then compressed according to the present entries within memory.

Data Decompression

o^S^ d ^ P ^. i ^ i ^,^'^ i n* e operation starts bj? r^ttlhtrnemoVy^S^hd initializing the 

* ^f^^si^refefj^n^ess ^v^^t^^^km^ stSng 
slocated(e.g a-mot" codeword 6faH 

the value of the codeword or .n the alternative wit* 1 an identifier bit within ^eriiory. • - 

1 Wi°H ?K l'^ W^#-^M>*^-»5a?^ thememorV is ; rea<(; and (assuming 

*& ft^^EEH^ £ ^^l!^'!^.atf*UN) stack inside citro. logic 
146 (FIG 4). The codeword field (DATA_OUTJ19!8]) of bus 186- ffiftiWoftcoo^W-tt^ bacfcfe address 

back. *e last byte of thedateentryreadfromrh^^ 
- ^ « ^ W- cdd^rd; a t^h;tlme anew bbd<^6rd ^read^^c^lS^s 

exte^n^' ri eW0,d " id ! nUf ied * ' aSt deC ° ded CharaCter 0Ut P ut is conMfet Mlftlie previous 
SSftl £ ?- 1* J re3d int ° * e n6Xt aVai,aWe address in memor y 188 - R ^d select logic 172 
th Sta omt k ,»r: d l and direCtS mUlCplexer 176 accord in9'y t° connect externa. address bus^77 or 

decompressed decoded characters on bus IS^*"'"'-' «*»-••>•• ■■> S't J • v;: Vi',!,:o;': ;:••-...(.■•» .• v ^vi< 

^'-'i^^K-^^iintod list 

and a ho,thens ^ 
simpjf.e^usjnga^co^ 

.i«r V metnod 'tfaatacbmpress^ or linked 

T^f. 51 ?! * sed da ^'«naso be Implemented using the present system: 

the ?" d dasheabiock 234 is the decompression process for 

a^^XSTT 1 data (K) «° .^ioh block 22fraidng with the coded char- 

I?^1n« f (OMEGA; otrtputfrom block 228. As noted ab6Ve; bMEOArepresehts an address of a data entry 
ES^^T? «WOJK«fc*d K are' concatenated together and ^cornpared with the entries WithiJ 
tfeiS^-^- W- 2 ^^ OI « EG ^ '."Put matches an entry irt rrtemory. block ^228 encodes 

anSSAT. T^^^^> d ^^^^^^^ fe P^^ P ateduntn 

S3 S^,* vi,° U ri S ° MEGA - antffeedS the Character K >"to coding block 228. K is encoded 
226^ -, and ^'f" 3 !; d .wrth^e ne xt external data character K before bing fed back into decision block 

The encoded data, OMEGA, is sent to block 236 for decompression. A given encoded input character (OMEGA(i)) is used as an address for accessing the string table memory. Decision block 238 determines if the data entry at the address OMEGA(i) is a root character. If it is, there are no additional encoded characters in the data entry output from memory (e.g., OMEGA(j) does not exist). The memory data entry for K is then output as a decompressed output character on line 246. Decision block 238 jumps to block 240, where the previous encoded character (OMEGA(i-1)) is concatenated with K and written into the next available memory address location. Block 242 then directs block 236 to use the next encoded character (OMEGA(i+1)) in the input stream as the address location for the next data entry read from memory.

If the output from the string table memory is not a root (e.g., the output comprises an encoded character (OMEGA(j)) and a decoded character K), K is output on line 246 and decision block 238 jumps to block 244. Block 244 uses the encoded character (OMEGA(j)) as the address for the next data entry output from memory. The data entry at memory location OMEGA(j) is then processed as described above. The process is repeated until every encoded input character is decompressed.

FIG. 8 is a detailed data flow diagram of dashed block 232 in FIG. 7. The data compression process begins when a start or reset signal is instigated in block 248. A memory circuit (described below) is initialized in block 250, for example, to operate in the compression or decompression mode and to reset the dictionary. Any dictionary valid bits need to be initialized. Preferably, the dictionary is initialized with single-character codewords, or with a set of codewords externally generated in accordance with a selected coding algorithm or standard, each paired with a null codeword to identify the entry as a single character or "root" codeword. Alternatively, rather than pre-storing a set of codewords, they could be generated in real time each time a match fails, for example, as disclosed in commonly-assigned U.S. Pat. No. 5,142,282, on Data Compression Dictionary Access Minimization. Other initialization schemes can be used, including an empty dictionary.

The first character in an input data stream is read in block 252 and either stored directly in the OMEGA field or encoded (e.g., CODE(CHAR)) and then stored in the OMEGA field. Then, the next input character (K) in the input data stream is read. Block 258 treats OMEGA and K as a character string (i.e., concatenates OMEGA-K) and then searches the dictionary for a data entry that matches the OMEGA-K string. Since no data string has yet been stored in the dictionary, decision block 260 indicates that there is no match. Since the OMEGA-K string is not presently represented, it is stored in memory if decision block 266 determines there is available storage space. If the memory is not full, the operation in block 268 automatically loads the OMEGA-K string into the next available memory storage location (ADDR(N)). Block 270 then increments an address counter to identify the next available storage location in memory (ADDR(N+1)). The encoded value OMEGA (an address) for the first input character, if applicable, is output as the first character in the encoded data string in block 272.

When the memory is full, the compression system can simply be disabled from writing any additional character strings into memory. For example, if decision block 266 determines that the memory is full, the character string loading step of block 268 and the address counter incrementing step of block 270 are skipped and the process jumps to the encoding and output process of block 272, further described below.

After OMEGA is output, the step of block 274 replaces the first input character (OMEGA) with the second input character (K) or code(K). The next input character from the input data stream is then read (K), thereby providing the next OMEGA-K string. The process then loops back to block 258, where the memory is searched with the new OMEGA-K string.

If a match is indicated by decision block 260, the process jumps to a step in which OMEGA is replaced with an encoded value representing the OMEGA-K string, which is equal to the match address. The next input character from the data stream is then copied into the K field. The OMEGA and K fields are combined, forming a new OMEGA-K string which now represents three input characters. The process returns to block 258, where dictionary data entries are compared with the new character string. Additional input characters are added to the character string as long as the previous character string matches a data entry in memory. When a new character string no longer matches a data entry, decision block 260 jumps to block 266, where the memory update procedures of blocks 266, 268, and 270 are performed as described above. Block 272 outputs the value OMEGA (e.g., the encoded character string from the last input character string/data entry match). Block 274 takes the last character in the character string (e.g., the character that caused the character string to no longer match any data entry in the string table) and copies it into the OMEGA field. Block 274 then copies the next input character from the input data stream into the K field and the process loops back to block 258. The character string is thereby compressed, since the single encoded value of OMEGA output from the compression process represents multiple input characters.
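The compression flow described above can be sketched as a short function in which each codeword is simply the address of the matching entry. This is an illustrative software model, not the CAM circuit: the function name `compress`, its parameters, the use of a Python `dict` in place of the CAM search, and the final flush of OMEGA at end of input are all assumptions. The block numbers in the comments refer to the steps named in the text.

```python
def compress(data, roots, capacity=4096):
    """Return the output codewords (entry addresses) for `data`.

    `roots` lists the single characters preloaded at addresses 0..len-1.
    Characters absent from `roots` raise KeyError in this sketch.
    """
    # Each entry is an (OMEGA, K) pair; root entries carry a null OMEGA.
    table = {(None, ch): addr for addr, ch in enumerate(roots)}
    next_addr = len(roots)
    out = []

    it = iter(data)
    try:
        omega = table[(None, next(it))]    # block 252: first character
    except StopIteration:
        return out                         # empty input

    for k in it:                           # next input character K
        match = table.get((omega, k))      # blocks 258/260: search OMEGA-K
        if match is not None:
            omega = match                  # match address becomes new OMEGA
            continue
        if next_addr < capacity:           # block 266: memory not full
            table[(omega, k)] = next_addr  # block 268: store OMEGA-K
            next_addr += 1                 # block 270: next free address
        out.append(omega)                  # block 272: output OMEGA
        omega = table[(None, k)]           # block 274: K (encoded) becomes OMEGA

    out.append(omega)                      # flush the pending OMEGA (assumption)
    return out
```

For instance, with the four root characters of the FIG. 10 example, `compress("RINTINT", "RINT")` yields `[0, 1, 2, 3, 5, 3]`: the second "IN" is emitted as the single codeword 5, the address at which "1N" was stored.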

FIG. 9 is a detailed flow diagram of decompression circuit 246 in FIG. 7. Block 276 initializes the string table memory for decompression. Block 278 gets the first encoded word (OLDWORD). If no more data is available during this or any subsequent input read step, then the process is exited. The first encoded word is decoded at block 280, either algorithmically or by reading a preloaded entry in the string table memory. The first encoded word is a root character and is therefore decoded and output.

Block 282 gets the next encoded word (INCODE). Block 284 uses INCODE as the address of the data entry output by the string table. Initially, in one implementation, the string table will consist of only single character bytes, so block 284 will output a byte K. Byte K is then output in block 286. In later cycles, block 284 will return OMEGA-K as discussed further below.

Block 288 determines whether the byte is the end of a string (e.g., a root character) and, if so, jumps to block 292. Block 292 builds a new data entry in the next available address in the string table, which consists of the concatenation of the first encoded input word (OLDCODE) and the last byte output (K). Block 294 points to the next unused address location and block 296 replaces OLDWORD with the last encoded input word (INCODE) and returns to block 282.

Block 282 reads the next encoded input word INCODE, and block 284 reads the data entry at the address INCODE. If the data entry is not a root, it comprises a decoded byte K and a codeword field pointing to a next address for further decoding (OMEGA). Block 286 will then output K and decision block 288 will jump to a step that uses OMEGA as the address of the next data entry output from the memory, and then loops back to block 284. The process is repeated until the data entry output from the string table contains a root character (i.e., the end of a string). Decision block 288 then proceeds to block 292, where the previously read encoded word (OLDCODE) is concatenated with the last output byte (K). The functions in blocks 294 and 296 are then performed and the process returns to block 282. Thus, the decompression process regenerates the original data stream compressed in the compression process of FIG. 8.
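The decompression flow can be sketched in the same style. The function name `decompress`, the list-based string table, and the explicit LIFO stack used to restore character order are assumptions for illustration. The sketch also does not handle a codeword that refers to an entry not yet written (a case that can arise in LZ2-style schemes and is not covered by the passage above); such input raises an IndexError here.

```python
def decompress(codes, roots):
    """Rebuild the character stream from a list of codewords (addresses)."""
    # address -> (OMEGA, K); root entries carry a null OMEGA.
    table = [(None, ch) for ch in roots]
    out = []
    oldcode = None

    for incode in codes:                 # block 282: next encoded word
        stack = []                       # bytes come out last-first
        addr = incode
        while addr is not None:          # follow the OMEGA links to a root
            addr, k = table[addr]        # blocks 284/286: read entry, output K
            stack.append(k)
        # k now holds the root byte, the last byte output by the walk.
        if oldcode is not None:
            table.append((oldcode, k))   # blocks 292/294: new entry OLDCODE-K
        out.extend(reversed(stack))      # restore the original character order
        oldcode = incode                 # block 296

    return "".join(out)
```

Fed the codewords `[0, 1, 2, 3, 5, 3]` with roots "RINT", the function returns "RINTINT" and rebuilds the same address/entry pairs that a compressor following the FIG. 8 flow would have stored.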

FIG. 10 is a graphical depiction of the compression and decompression algorithms in FIGS. 8 and 9. A raw data stream 300 comprises an uncompressed string of characters which are input to the data compression/decompression process illustrated in FIG. 7. In this example, single characters R, I, N, and T have been loaded during initialization. Characters are encoded by their address locations in memory. To increase compression speed, single input characters can be encoded algorithmically prior to initiating the process described below. Memory 302A illustrates the dictionary state immediately after initialization and memory 302B illustrates the dictionary after compression of character stream 300.

The first input character R from character stream 300 matches the data entry at address location ADDR0. Since there was a match, the compression system concatenates the encoded value for R (ADDR0=0) with the next input character I, and memory 302A is searched for a "0I" match. Because there is no "0I" match in memory, "0I" is written into the next available memory location (ADDR4), as illustrated in memory 302B. The codeword for the largest matched sequence (0) is output as the first encoded character in compressed character stream 304. The system then searches for the string comprising the encoded value for I (i.e., ADDR1=1) and the next input character "N". Since the string "1N" is not in the dictionary, it is written into the next available memory location (ADDR5), as shown in 302B. The value 1 (e.g., last matched character string = I) is output as the second encoded character in compressed character stream 304.

The process continues until the second I in the uncompressed character stream 300 is processed (e.g., character 306). The compression system encodes I as 1, since I is located at address location ADDR1. The encoded value 1 is concatenated with the next input character N and compared with the entries in memory 302B. Since the sequence "IN" has previously occurred in character stream 300, the string "1N" matches an entry in memory (ADDR5). The string "1N" is therefore encoded as "5" and concatenated with the next input character T. Since the string "5T" does not match any entry in memory 302B, "5T" is written into the next available address location (ADDR8) and the codeword for the last matched character string "5" is output in character stream 304. The encoded value for input character T (ADDR3=3) is then concatenated with the next input character and the process is repeated. Memory 302B shows all characters built for the dictionary from character stream 300. Character stream 304 is the complete compressed character stream for raw data stream 300. Notice that only six encoded characters are required to represent the nine characters in character stream 300.

The decompressor dictionary is reinitialized for decompression as illustrated in 302C so the first four address locations contain the decoded values for the single input characters R, I, N, and T, respectively. Again, single character decoding may also be performed algorithmically. The first encoded input character "0" is used as an address into memory 302C. The decompression system determines that the value "0" is a root codeword, for example, by checking that the value is less than 4. The data entry at ADDR0 (e.g., R) is thereby output as the first character in decompressed character stream 308. The decompression system then reads the next encoded input character "1". This value is again a root codeword and therefore the data entry at ADDR1 is output as the second character I in decompressed character stream 308.

At this point a new dictionary entry is built using the last decompressed character I concatenated with the previous codeword "0". The string "0I" is then written into the next available address location (ADDR4), as shown in memory 302D. The next codeword "2" is input and the process is repeated. This time the data entry at address location ADDR2 (e.g., N) is output and then the string "1N" is written into memory at address location ADDR5.

The process is repeated in the same manner until input character "5" is read by the decompression system. The decompression engine uses this codeword to reference the data entry at ADDR5. The encoded character "5" is not a root since it is greater than three; therefore, the decompression system outputs the last byte of the data entry at address location ADDR5 (e.g., "N"). The rest of the data entry (e.g., "1") is used as the next address. Since the codeword "1" is a root, the data entry at ADDR1 (e.g., I) is output and no further decompression is required. The decompressed characters are then placed in character stream 308. A new data entry in memory is written into address location ADDR7 using the last decompressed output character I and the previous encoded input character "3". The process is repeated until all characters in character stream 304 are decompressed. It will be noted that dictionaries built using the HP-DC scheme with hashing are different. In contrast, the compression and decompression dictionaries 302B and 302D built by the present system and method have identical addresses/entries.

III. Using Multiple Dictionaries in a CAM Compression/Decompression System



To further reduce the amount of memory required to compress data using a CAM, the CAM data compression system previously illustrated in FIG. 5 is used in conjunction with a standby dictionary (see FIG. 3). The CAM, while having the capacity to process one character each clock cycle, can now compress data using minimal memory. In addition, the data compression ratio is increased by maintaining a useful set of character strings in the current dictionary after a reset. The method illustrated below is adaptive, whereby the dictionary is embedded in the codewords so that a separate dictionary does not have to be transferred before each decompression process.

FIG. 11 is a high level block diagram of the combined CAM multi-dictionary compression/decompression
system. For illustrative purposes, the system is implemented using a 2^b x (b + m + 2) CAM 312 similar to that
illustrated in FIG. 5. The CAM 312 comprises a control bus 314 coupled to a control processor (not shown).
An address bus 316 (b-bits wide) and a data bus 318 (n-bits wide) are coupled to CAM 312. The zero bits of
an n-bit wide DATA_MASK bus 320 disable the corresponding bits during a CAM search. For example, a "0"
signal on the first mask bit (DATA_MASK[0]) disables the first DATA_IN bit (DATA_IN[0]) fed into CAM 312. A
disabled DATA_IN bit is not taken into account when searching CAM 312 for a data entry that matches the
signal on line 318. Data masking circuits are well known in the art. Therefore, the details of the masking circuit
used in CAM 312 will not be shown in detail. A match success line 322 goes active whenever the data on bus
318 matches a previously stored entry in CAM 312. MATCH_ADDRESS bus 326 contains the address of a
matched data entry and DATA_OUT line 324 is used to output data entries previously stored in CAM 312.
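
A masked search of this kind can be modelled in a few lines of Python. The sketch below is an assumption-laden software model, not the CAM circuit: the function name `cam_search` and the list representation are hypothetical, and the loop scans entries one at a time where the hardware compares all locations in parallel.

```python
def cam_search(cam, data_in, data_mask):
    """Return (match_success, match_address) for a masked CAM search.

    A bit position takes part in the comparison only where data_mask has a 1;
    a 0 mask bit disables the corresponding DATA_IN bit.
    """
    for address, entry in enumerate(cam):
        if entry is not None and (entry ^ data_in) & data_mask == 0:
            return True, address
    return False, None


cam = [0b1011, 0b0110, None]                 # None marks a location never written
print(cam_search(cam, 0b0111, 0b0111))       # low three bits compared: no match
print(cam_search(cam, 0b0111, 0b0110))       # lowest bit also masked: matches address 1
```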

FIG. 12 shows the different fields contained within each dictionary entry in the CAM. Each CAM data entry
has three fields: a character field (CHAR) which is m-bits wide for storing the suffix character K, a code field
(CODE) b-bits wide for storing the encoded character value OMEGA, and a status field (ST) two-bits wide for
storing the dictionary status bits for the associated CODE and CHAR fields. The status field (ST) takes one
of four possible values as follows:

FREE: The CAM memory location is presently unused in the current dictionary;

CD: The CAM location contains a data entry that belongs to the current dictionary, but not to the standby
dictionary;

SD: The CAM location contains a data entry that belongs to both the current and standby dictionaries;
and

INV: Invalid value, should not occur in normal operation.

The binary values corresponding to FREE, CD, SD and INV are not fixed. The compressor and decompressor
operate as state machines that can be in any one of four possible states (S), where 0 ≤ S ≤ 3. The
specific binary values for the status field (ST) are functions FREE(S), CD(S), SD(S), and INV(S) of the state
(S) and are defined in FIG. 13. For example, in state S=0, if the bits [0:0] exist in the status field of a CAM
data entry, that memory location is FREE and regarded as not presently being used in the current dictionary.
If the compressor/decompressor system is in state S=2, however, a CAM location with bit values [0:0] in its
status field is regarded as a data entry that has been assigned to the standby dictionary.

Initially, the system is in state S=0, and all the ST fields are set to [0:0] (e.g., ST=FREE(S)). This is the
only time a global initialization is necessary, as will be explained further below, minimizing the initialization
time delay that would occur during subsequent dictionary resets. The compressor, initially in state S=0, starts
reading input characters, compressing input strings and building, in parallel, the current dictionary (CD) and
standby dictionary (SD). When the CD becomes full, a dictionary switch occurs, whereby the data entries in
the SD become the new data entries in the CD. The SD is essentially emptied, removing all valid data entries.

The dictionary switch occurs when the system makes the state transition S=0 -> S=1. Referring to FIG.
13, in state S=1, the free entries are those with ST=[1:0], which is the same as the CD value in state S=0. In
state S=1, CD entries are those with ST=[1:1], which is the same as the SD value in state S=0. A state transition
occurs only when the CD becomes full, so all entries in the CAM will either be marked CD or SD (i.e., no entries
in the status field with a FREE value, and the value INV is never written). Therefore, immediately after the state
transition from S=0 to S=1, all entries have the value FREE or CD (with the exception of the initial single-character
strings, which are not actually kept in the CAM). No entries are left
with the value SD, so the new SD starts empty. A similar situation occurs in the state transitions S=1 -> S=2,
S=2 -> S=3, and S=3 -> S=0. FIG. 14 illustrates the state transition changes for the compression/decompression
system as described above.

FIG. 15 illustrates a simple hardware implementation of the state control. The
initial bit values of a status register 328 are illustrated in FIG. 15. At each state change, the values in the
status register shift cyclically so that FREE -> INV -> SD -> CD -> FREE. Thus, state control is simply
implemented using an 8-bit cyclic shift register and shifting register 328 two bits to the left for each state change.
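
The cyclic status register can be sketched as follows. This is a minimal model under stated assumptions: the field order FREE, CD, SD, INV within the 8 bits and the class name `StatusRegister` are hypothetical, and the S=0 value of INV ([0:1]) is inferred as the one pattern the text does not assign to FREE, CD, or SD. FIG. 13 itself is not reproduced here, so the table printed below is derived from the rotation rule rather than copied from the figure.

```python
FIELDS = ("FREE", "CD", "SD", "INV")   # assumed field order, most significant first


class StatusRegister:
    """8-bit cyclic shift register holding one 2-bit pattern per status name."""

    def __init__(self):
        self.bits = 0b00_10_11_01      # state S=0: FREE=00, CD=10, SD=11, INV=01
        self.state = 0

    def value(self, name):
        shift = 6 - 2 * FIELDS.index(name)
        return (self.bits >> shift) & 0b11

    def next_state(self):
        # Rotate two bits to the left: the old CD pattern now means FREE,
        # the old SD pattern means CD, old INV means SD, old FREE means INV.
        self.bits = ((self.bits << 2) | (self.bits >> 6)) & 0xFF
        self.state = (self.state + 1) % 4


def status_table():
    """Return the pattern assigned to each status name in states 0 to 3."""
    reg = StatusRegister()
    table = []
    for _ in range(4):
        table.append({name: format(reg.value(name), "02b") for name in FIELDS})
        reg.next_state()
    return table


for state, row in enumerate(status_table()):
    print(state, row)
```

In this model the pattern 00 means FREE in state 0 and SD in state 2, which agrees with the example given in the text for S=2.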

In describing the CAM-based standby dictionary scheme, the contents of a CAM memory location are
denoted by a triplet (ST, CODE, CHAR), and code(A) represents the encoded value for a single-character string
"A". For description purposes, codewords are assigned values corresponding to memory address locations.
However, codeword values are also easily derived as simple functions of the address locations, which would
be easily implemented by one skilled in the art. It is assumed that the single-character codes (code(A)) lie within
a predefined address space (e.g., addresses 0 to 2^m - 1). As
explained previously (see FIG. 5), the memory locations corresponding to these codes do not need to physically
exist in the CAM. "End of file" conditions are also ignored.

Implementation of the CAM Based Multi-Dictionary System

FIG. 16 is a detailed circuit diagram of the CAM multi-dictionary compression/decompression
system. The circuit diagram in FIG. 16 illustrates the additional functional components necessary to provide
multi-dictionary compression and decompression. The CAM 312 is the same as that
illustrated in FIG. 11 and the status register 328 is the same as that illustrated in FIG. 15. A
register 342 and a MASK register 350 feed the ST, CODE, and CHAR fields through the DATA_IN and MASK
ports respectively of the CAM 312. The ST field for each data entry in the CAM is controlled directly through
the status register 328 or indirectly through an ST pattern generator 338 shown
in detail in FIG. 17.






The specific CD and SD lines feeding the DATA_IN port are selected by a multiplexer 340 (MUX
M1) by manipulating control bus 314. The control bus signals originate from a system processor (not
shown) and control compression/decompression functions within CAM 312. Control bus 314 contains the same
read, write, search and reset signals as previously illustrated in FIG. 5. The internal compressor/decompressor
control logic within CAM 312 is also similar to that illustrated in FIG. 5. Minor modifications to this logic may
be required to implement some of the specific features described below. These circuit modifications are easily
implemented by one skilled in the art and are therefore not illustrated in detail.

A line 326 couples the MATCH_ADDRESS port of CAM 312 to the DATA_IN port of CAM 312. An external
data bus 344 is coupled directly to the ADDRESS_IN port and coupled to the DATA_IN port through register
342. An ST pattern generator line 336 and a data input line 348 feed the ST field of the MASK register 350 through
a multiplexer 346 (MUX M2). A search type signal on line 349 and various other control signals from a control
generation circuit are controlled by the MATCH_SUCCESS signal on line 322. The DATA_OUT signal on
line 324 outputs compressed or decompressed data to data interfaces as shown in FIG. 4. An address
pointer 354 (NEXT_CODE) can write data to a second address pointer 356 (SAVE_CODE) or can receive data
from the CAM DATA_IN port.

FIG. 17 is a detailed circuit diagram of the ST pattern generator 338 from FIG. 16. The first bit from the
CD field and the SD field of status register 328 (FIG. 16) are input to an AND gate 358 and an EXCLUSIVE-NOR
gate 362. The second bit from the CD and SD fields are coupled to AND gate 360 and an EXCLUSIVE-NOR
gate 364. The AND gates feed the ST field of the CAM DATA_IN port and the EXCLUSIVE-NOR gates
feed the ST field of the CAM MASK port (FIG. 16).

The compression/decompression system must be able to search for both CD and SD dictionary entries
simultaneously (as discussed in detail below). This is performed by manipulating the bits in status register 328
(FIG. 16). One of the bits in the CD and the SD will always match and the second bit will always be different
(see FIG. 13). Thus, the matching bit is used to search for a valid SD or CD dictionary entry and the second
bit is masked out. For example, in state S=0, the bit values for the current dictionary CD are [1:0] and the bit
values for the standby dictionary are [1:1]. This drives the outputs of AND gate 358 and EXCLUSIVE-NOR
gate 362 high and drives the outputs of AND gate 360 and EXCLUSIVE-NOR gate 364 low. Therefore any ST
field in the CAM dictionary with a "1" located in its first bit position (e.g., CD(S) or SD(S)) is identified as a
valid dictionary entry of either the CD or SD.
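
The gate arrangement of the ST pattern generator can be checked with a short Python sketch. The per-state bit patterns below are an assumption: only the S=0 values are stated in the text, and the remaining rows are derived from the cyclic reassignment FREE -> INV -> SD -> CD -> FREE rather than copied from FIG. 13. The function names `st_pattern` and `st_is_valid` are hypothetical.

```python
# Bit pairs [first:second] per state, derived from the S=0 values in the text
# and the cyclic reassignment FREE -> INV -> SD -> CD -> FREE (assumption).
STATES = [
    {"FREE": (0, 0), "CD": (1, 0), "SD": (1, 1), "INV": (0, 1)},
    {"FREE": (1, 0), "CD": (1, 1), "SD": (0, 1), "INV": (0, 0)},
    {"FREE": (1, 1), "CD": (0, 1), "SD": (0, 0), "INV": (1, 0)},
    {"FREE": (0, 1), "CD": (0, 0), "SD": (1, 0), "INV": (1, 1)},
]


def st_pattern(cd, sd):
    """Return the (DATA_IN, MASK) bit pairs for a combined CD-or-SD search."""
    data = tuple(c & s for c, s in zip(cd, sd))         # AND gates 358 and 360
    mask = tuple(1 - (c ^ s) for c, s in zip(cd, sd))   # EXCLUSIVE-NOR gates 362 and 364
    return data, mask


def st_is_valid(st, data, mask):
    """True when every unmasked ST bit equals the corresponding search bit."""
    return all(m == 0 or bit == d for bit, d, m in zip(st, data, mask))


for s, values in enumerate(STATES):
    data, mask = st_pattern(values["CD"], values["SD"])
    hits = [name for name, st in values.items() if st_is_valid(st, data, mask)]
    print(s, data, mask, hits)
```

Under these assumed patterns, the search pattern selects exactly the CD and SD values in each of the four states and rejects FREE and INV.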

Data Compression

The system in FIG. 16 compresses data in the following manner. The system is set to state S=0 by loading
the status register 328 with bit values as illustrated in FIG. 15. All ST fields in the CAM dictionary are set to
ST=FREE(S) (i.e., [0:0]) and the address pointer NEXT_CODE is set to the first available address in the CAM.
As discussed previously for the CAM illustrated in FIG. 5, the single input characters can be encoded
algorithmically during data compression, in which case the CAM is used only for storing multiple-character
strings. If single data characters are stored in the CAM, however, the first available address for writing an
encoded character string will typically be the address location after the last single-character location.

If necessary, a first input character is encoded by reading the first data character from input data line 344
and generating the address for the input character/data entry match on line 326. The encoded first character
(OMEGA) is then concatenated in register 342 with a second input character (K) from input data line 344 to
generate an OMEGA.K character string. A search is performed in the CAM for a data entry that matches the
OMEGA.K string. At the same time, the ST field is searched for a CD or SD value that matches the value
generated by ST pattern generator 338 (e.g., an OMEGA.K string that has already been stored as a CD or SD entry).
All bits of SEARCH_TYPE signal 349 take the value "1" when searching for a match, which enables the CODE
and CHAR fields of the CAM mask. MUX M1 and MUX M2 select the ST fields for the DATA_IN and MASK
ports respectively from the ST pattern generator 338 as previously illustrated in FIG. 17.

Since this is the first OMEGA.K string fed into the CAM, the MATCH_SUCCESS signal on line 322 indicates
no match. In turn, OMEGA is output on line 324 and the character string CD(S), OMEGA, K is written into the
ST, CODE, and CHAR fields respectively at CAM dictionary location NEXT_CODE. The character K of the
OMEGA.K string is then encoded (code(K)) and used as the new value for OMEGA. The CD(S) value written
into the ST field is supplied directly from register 328 by altering the input of MUX 340 which feeds into the
ST field of register 342.

The system then searches for the next FREE entry (e.g., ST=FREE(S)). Accordingly,
SEARCH_TYPE signal 349 takes the value "0", masking out the CODE and CHAR fields and enabling the
ST field via the [1:1] bit values on line 348. At the same time, control line 314 coupled to MUX 340 selects the
value FREE from register 328 as the value searched in the ST field. The match address from line 326 is used
as NEXT_CODE for storing the next unique OMEGA.K string. The process extracts the next character from
the input data string on line 344 and concatenates it with OMEGA, generating the OMEGA.K string for the next
search. If a match is found, the match address is fed back to the DATA_IN
port on MATCH_ADDRESS line 326 for the next match attempt. This address is used as the new OMEGA value
representing the previous OMEGA.K string. At the same time, the SD(S) value from register 328 is written into
the ST field at the match address.

As described above, after a new OMEGA.K string is written into a CAM location, a search is performed
to find the next FREE value in the status field. A failed search indicates the current dictionary is full and causes
the system to switch into state S=1. This is performed by rotating the contents of register 328 two bits
to the left. The status field locations previously having SD(S) values now constitute CD(S) values. Because
all the status fields in the CAM had been set to either CD(S) or SD(S) in state S=0 (e.g., no FREE status field
values exist just prior to the state change), all FREE memory locations in state S=1 will be previous CD(S)
entries from state S=0. In addition, the standby dictionary will be empty except possibly for the initial single-character
strings in state S=1 since the INV value is never written in state S=0. Compression continues as
described above with the system in state S=1. This process continues generating compressed data characters
and switching states until all of the input data is compressed.
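
The compression procedure described above can be simulated in software. The following Python sketch is illustrative only and rests on several assumptions that are not in the text: the names `compress`, `StatusRegister`, `alphabet`, and `cam_size` are hypothetical; single characters are encoded algorithmically rather than stored in the CAM; searches are modelled as sequential scans that return the lowest matching address; and, if no FREE location exists even after one state change, the sketch simply changes state again. `cam_size` must be at least 1.

```python
FREE, CD, SD = 0, 1, 2   # field positions in the status register (INV is field 3)


class StatusRegister:
    """8-bit cyclic register; each 2-bit field holds the current pattern of one status."""

    def __init__(self):
        self.bits = 0b00_10_11_01        # S=0: FREE=00, CD=10, SD=11, INV=01

    def value(self, field):
        return (self.bits >> (6 - 2 * field)) & 0b11

    def rotate(self):                    # two-bit left rotation = one state change
        self.bits = ((self.bits << 2) | (self.bits >> 6)) & 0xFF


def compress(data, alphabet, cam_size):
    """Return the list of codewords for data using a CAM with a standby dictionary."""
    root = {ch: code for code, ch in enumerate(alphabet)}
    base = len(alphabet)                 # first CAM address after the root codes
    reg = StatusRegister()
    cam = {a: [reg.value(FREE), None, None] for a in range(base, base + cam_size)}

    def find_free():
        while True:
            for addr, entry in cam.items():
                if entry[0] == reg.value(FREE):
                    return addr
            reg.rotate()                 # current dictionary full: next state

    out = []
    if not data:
        return out
    omega = root[data[0]]
    next_code = find_free()
    for k in data[1:]:
        live = (reg.value(CD), reg.value(SD))
        match = next((a for a, (st, code, ch) in cam.items()
                      if st in live and code == omega and ch == k), None)
        if match is not None:
            cam[match][0] = reg.value(SD)    # matched entry joins the standby dictionary
            omega = match
        else:
            out.append(omega)
            cam[next_code] = [reg.value(CD), omega, k]
            omega = root[k]
            next_code = find_free()
    out.append(omega)                    # flush the last pending codeword
    return out


print(compress("RINTINTIN", "RINT", 5))  # prints [0, 1, 2, 3, 5, 3, 5]
```

With the four single characters R, I, N, T and five string locations, the sketch reproduces the state change after the fifth string is written and emits the codeword sequence 0, 1, 2, 3, 5, 3 followed by a final 5.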
Data Decompression

Data decompression using the system in FIG. 16 is performed in the following manner. The CAM is
initialized by resetting the bits in register 328 to state S=0. The FREE bit values are written into the status field of
each available memory dictionary location. The internal address pointer 354 (NEXT_CODE) is set to the first
available memory location in the CAM (e.g., NEXT_CODE = 2^m) and internal address pointer 356
(SAVE_CODE) is set to zero.

Decompression is performed in the same manner as described above in FIG. 5. For example, the first
encoded character from the encoded character string (OMEGA) is read on line 344. OMEGA is then used as the
address fed into the CAM ADDRESS_IN port. If the value of the CODE field output on line 324 is not a "root",
the CHAR field is output on line 324 and the CODE field is fed back into the CAM as the next address location.
This process is repeated until a "root" CODE field is read out from the CAM.

After a compressed input character (OMEGA) has been decompressed and the decompressed character
string output on line 324, the ST field at address location OMEGA is set to SD(S). This is performed by writing
the SD(S) value from register 328 into the ST field of register 342. The dictionary is then built by feeding back
the first character (K) from the decompressed data string into the CHAR field of register 342 at address location
SAVE_CODE. The CD(S) value from status register 328, the OMEGA value originally read over line 344, and
the first character from the decompressed OMEGA output string (K) are written into the ST, CODE, and CHAR
fields of the CAM dictionary at address location NEXT_CODE. The value of address pointer NEXT_CODE is
then written into address pointer SAVE_CODE. A "0" value is placed on line 349 and the [1:1] bit values on
line 348 allow a "status field only" search. The next dictionary entry in the CAM with a FREE status field is
then found by searching the ST fields for a FREE value. The address of the matching location is written
into address pointer NEXT_CODE.

If the current dictionary is full (e.g., no FREE status field values exist), the system is switched to state S=1
by shifting the bits in register 328 as described above and the address pointer NEXT_CODE is reset.
The current dictionary will therefore only contain entries from the previous standby dictionary. The system then
reads the next encoded character (OMEGA) from line 344 and the data decompression process is continued.
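
A software model of the decompression procedure is sketched below. As with any simulation of a circuit, it is a sketch under stated assumptions: the names `decompress`, `StatusRegister`, `alphabet`, and `cam_size` are hypothetical; `None` stands in for a SAVE_CODE of zero; root characters are decoded algorithmically and therefore carry no status field; every non-root entry visited while decoding a codeword is marked SD(S), which is this sketch's reading of how the decoder mirrors the compressor; and the sketch changes state repeatedly if no FREE location is found. Invalid codeword sequences are not detected.

```python
FREE, CD, SD = 0, 1, 2   # field positions in the status register (INV is field 3)


class StatusRegister:
    """8-bit cyclic register; each 2-bit field holds the current pattern of one status."""

    def __init__(self):
        self.bits = 0b00_10_11_01        # S=0: FREE=00, CD=10, SD=11, INV=01

    def value(self, field):
        return (self.bits >> (6 - 2 * field)) & 0b11

    def rotate(self):                    # two-bit left rotation = one state change
        self.bits = ((self.bits << 2) | (self.bits >> 6)) & 0xFF


def decompress(codes, alphabet, cam_size):
    """Rebuild the original string from codewords, mirroring the compressor's dictionary."""
    base = len(alphabet)
    reg = StatusRegister()
    cam = {a: [reg.value(FREE), None, None] for a in range(base, base + cam_size)}

    def find_free():
        while True:
            for addr, entry in cam.items():
                if entry[0] == reg.value(FREE):
                    return addr
            reg.rotate()                 # current dictionary full: next state

    out = []
    next_code, save_code = find_free(), None
    for omega in codes:
        chars, addr = [], omega
        while addr >= base:              # follow CODE links until a root is reached
            cam[addr][0] = reg.value(SD)
            chars.append(cam[addr][2])
            addr = cam[addr][1]
        chars.append(alphabet[addr])
        string = "".join(reversed(chars))
        out.append(string)
        k = string[0]
        if save_code is not None:
            cam[save_code][2] = k        # complete the entry started on the previous code
        cam[next_code] = [reg.value(CD), omega, k]   # CHAR is provisional until next code
        save_code = next_code
        next_code = find_free()
    return "".join(out)


print(decompress([0, 1, 2, 3, 5, 3, 5], "RINT", 5))   # prints RINTINTIN
```

The provisional CHAR value written at NEXT_CODE is the first character of the string just decoded, so a codeword that refers to the entry created in the immediately preceding step still decodes correctly.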

FIG. 18 is a data flow diagram showing the general method for data compression using a CAM with a
standby dictionary. Block 376 is an initialization process that sets the state and status conditions for the system.
Specifically, the system is set to state S=0, all status registers in the CAM dictionary are set to ST=FREE(S),
and the address pointer is set for the next available address in the CAM (e.g., 2^m -> NEXT_CODE).

A first character from an input data stream is read in block 378 and encoded (e.g., code(K))
to provide the value OMEGA. The next input character (K) is read in block 380. Block
382 combines OMEGA and K together as a character string (e.g., concatenates OMEGA.K). A search is then
conducted that not only looks for a data entry match but also looks for one of two
alternate status register patterns (ST=CD(S) or ST=SD(S)). The search covers both current and standby
values since either value indicates a valid character string in the CAM.
For example, a status register value ST=CD(S) indicates that the associated CODE and CHAR fields have
been previously loaded during the present process state. A status register
value ST=SD(S) indicates the associated CODE and CHAR fields have been loaded with an OMEGA.K
character string and have been matched at least once in the present processor state with a second OMEGA.K character
string. Thus, both status register values indicate valid CAM data entries that should not be
overwritten.

If no data string has yet been stored in the CAM, decision block 384 indicates that there is no match. The
encoded value OMEGA is output as the first character in the encoded data string in block 388. The OMEGA.K
string is written into the first available CAM address location (NEXT_CODE). The status field (ST) at the
address location NEXT_CODE is written with the value CD(S), indicating a current dictionary data entry in the CAM. Block
388 then replaces OMEGA with the encoded value of the second input character (code(K) -> OMEGA).

Block 390 searches the CAM for the next available address location with ST=FREE(S). If a status register
with a FREE(S) value is not found, the current dictionary in the CAM is full. Decision block 392 thereby replaces
the current dictionary (CD) with the standby dictionary (SD) by changing the CAM into its next state S=S+1 mod
4. During a state change, the values of each status register are reassigned as previously described (see FIG.
13). The ST field values are reassigned as follows: FREE -> INV -> SD -> CD -> FREE. The process returns to
block 380, where the next input character (K) is read. The matching process is then repeated as described
above. If the current dictionary is not full, decision block 392 jumps to block 394. Block 394 determines the
next address in the CAM having a FREE status register value and assigns that address to NEXT_CODE (e.g.,
match_address -> NEXT_CODE). The process returns to block 380, where the next input character (K) is read.

If a match is indicated by decision block 384, the process jumps to block 386 where the OMEGA field is
replaced with the address of the dictionary entry that matched the OMEGA.K string. The matched character
string (represented by the CODE and CHAR fields at the match address) is automatically assigned to the
standby dictionary by setting the status field (ST) at the match address to SD(S). The process then returns to
block 380 where the next input character (K) is read from the data stream. The match address (OMEGA) and
K are concatenated to form a new OMEGA.K string which now represents three input characters. Block 382
then searches the current and standby dictionaries for a character string match.

[...] character string is set to SD(S). This will often be an "overkill", since this location might already [...]

FIG. 19 is a data flow diagram showing the general method for data decompression using a CAM with a
standby dictionary. Block 398 initializes the system to state (S=0) and initializes the ST field for all available
CAM locations to FREE(S). The address pointer NEXT_CODE is set to the first free address
location (e.g., 2^m -> NEXT_CODE) and a second address pointer SAVE_CODE is set to zero. The first coded
character from the compressed data string (OMEGA) is read in block 400.

Block 401 decompresses OMEGA into a decompressed character string W as described above in FIG. 16.
OMEGA is used as the address into the CAM. If the CODE field at address OMEGA is not a "root", the CODE
field is used as the next address, and the CHAR field at the next address is then output as the next decompressed
character. If the CODE field
at address OMEGA is a "root", the CHAR field at address OMEGA is output and the CODE.CHAR fields at
address OMEGA are assigned to the standby dictionary (e.g., SD(S) -> ST). Block 402 assigns the first
character of character string W to a register.
If address pointer SAVE_CODE is not zero, decision block 403 jumps to a block where the dictionary
is updated by writing the character string SAVE_CODE.C into the CAM dictionary at address location
(NEXT_CODE). If SAVE_CODE is equal to zero, no dictionary entry is written. The data entry at address
OMEGA is then assigned to the standby dictionary (SD(S) -> ST(OMEGA)) and the
present SAVE_CODE value is replaced with the value OMEGA. Block 406 searches for the next status field
with a value ST=FREE(S). If a FREE ST field is located, decision block 408 jumps to block 410 where the match
address is assigned to address pointer NEXT_CODE (e.g., MATCH_ADD -> NEXT_CODE). The process then
returns to block 400 where the next encoded character from the compressed data stream (OMEGA) is read
and decompressed.

If no status field contains an ST=FREE(S) value, decision block 408 jumps to block 412. The process is then changed
to the next state, causing the current dictionary to be switched with the standby dictionary (i.e., S=S+1 mod
4). This also causes the previous current dictionary entries to become FREE locations. Block
413 searches for the next free location with ST=FREE(S), resets the value of address pointer SAVE_CODE
and jumps to block 410. Block 410 assigns the address value of the FREE location located in block
413 to address pointer NEXT_CODE. Block 410 then returns to block 400 where the process continues until
all the data from the compressed data stream is decompressed.

FIG. 20 is a graphical depiction of the compression and decompression algorithms in FIGS. 18 and 19. A
raw data stream 414 comprises an uncompressed string of characters which are input to the CAM compression
process illustrated in FIG. 18. In this example, single characters R, I, N, and T have been loaded during
initialization into locations ADDR0, ADDR1, ADDR2, ADDR3 of memory 416 respectively. Single-character inputs
are encoded by assigning each character the value of its address location; however, to increase compression
speed, single-input characters can be encoded algorithmically prior to initiating the process described below.
Memory 416 illustrates the dictionary in state S=0 immediately after initialization and memory 418 illustrates
the dictionary in state S=0 immediately before replacing the current dictionary with the standby dictionary (e.g.,
changing from state S=0 to state S=1). Memory 420 illustrates the dictionary in state S=2 after compressing
raw data stream 414.

Each memory location in the dictionary is separated into a status field (ST), a code field (CODE), and a
character field (CHAR). For illustration purposes, it is assumed that there are only 5 dictionary locations in the
CAM available for storing character strings (e.g., ADDR4-ADDR8). Address locations ADDR0-ADDR3 are
designated for single characters and are not searched as available dictionary locations. The bits of each status
field are initialized to a value FREE=[0:0] (e.g., FREE(S)=0) and an address pointer NEXT_CODE is initialized
to the first available CAM memory location (NEXT_CODE = ADDR4).

The first input character "R" from raw data stream 414 matches the data entry at address location ADDR0.
The compression system encodes "R" as OMEGA=0, concatenates OMEGA
with the next input character "I" (OMEGA.K) and searches for a "0I" match in the CODE and CHAR fields in
memory 416. At the same time, the corresponding ST field in memory 416 is searched for the bit combinations
"1:0" or "1:1" (e.g., CD(S) or SD(S) in state S=0). All memory locations are FREE and no "0I" string has been
previously written into memory; therefore, no match will be found. The value of OMEGA ("0") is output
as the first character in a compressed stream 422 and the character string (CD(S), OMEGA, K) is written into
the first FREE memory location (ADDR4). The character "I" is then encoded, generating the next value for
OMEGA (e.g., OMEGA=1).

The CAM dictionary is searched for the next ST field with a FREE value and that address location is
assigned to the address pointer NEXT_CODE (e.g., NEXT_CODE=5). The next character from the raw data
stream (K="N") is then concatenated with OMEGA (OMEGA="1") and the CAM is searched for the "1N"
character string. Again no match will occur in the CAM and the character string (CD(S), 1, N) is written into address
location ADDR5. The process is repeated in the same manner, writing into the ST, CODE, and CHAR fields of
the next available address after a character string search fails to find a match.

The first character string/data entry match occurs for the second occurrence of the character combination
("IN") from the raw data stream 414. Since the
CODE and CHAR fields of address location ADDR5 are "1" and "N" respectively, and the status field was
previously set to ST=CD(S), a character string match occurs. The match address is used as the new OMEGA value
(OMEGA=5) and the data entry at ADDR5 is assigned to the standby dictionary (ST=SD(S)=[1:1] for S=0). The
next character "T" is read from the raw data stream and concatenated with OMEGA. The character
string ("5T"), which now represents three characters, is then searched as previously described. No OMEGA.K
string with the value "5T" exists in the CAM, so the string is written into the next available address location (ADDR8).
The encoded character "5" is output to compressed character stream 422 and the encoded value for "T" is used
as the next OMEGA value (OMEGA=3).

Memory 418 illustrates the status of the CAM immediately after writing the character string "5T" into
address location ADDR8. The process then searches memory 418 for the next FREE status field. Assuming ADDR8
is the last available location in the CAM, no FREE status field is found. Therefore,
the current dictionary is full and the system is accordingly changed to state S=1. In state S=1, the status field
bit values [1:0] constitute a FREE location and the bit values [1:1] constitute a current dictionary entry
(see FIG. 13). Therefore, all dictionary locations are FREE in state S=1, except the character
string at address ADDR5, which is now a current dictionary entry. After the state change, the address pointer
NEXT_CODE is reset to the first FREE memory location.

Referring to memory 420 in state S=1, the next input character 426 ("I") is then extracted from raw
character stream 414 and concatenated with OMEGA for the next OMEGA.K search ("3I"). The string "3I" resides
in memory location ADDR7; however, the status field at ADDR7 is FREE. Therefore, no match is
found, and the encoded value "3" is output as character 438 in compressed character stream 422. The
character string (CD(S), 3, I) is written into memory location NEXT_CODE, the character "I" is encoded as the next
OMEGA value, and the address pointer NEXT_CODE
is set to the next FREE location (NEXT_CODE=6). Note that address location ADDR5 is skipped because its
status field indicates a current dictionary entry after switching from state S=0 to state S=1.

The next input character 428 from raw data stream 414 is concatenated with OMEGA comprising the new
character string "1N". A match occurs at address location ADDR5, and therefore OMEGA is assigned the
match address value, and the status field at address location ADDR5 is assigned to the standby dictionary.
The bit assignments for the standby dictionary in state S=1 are [0:1] (see FIG. 13). The next input character
from raw data stream 414 is concatenated with OMEGA and the search process is repeated. The process
continues to change the state of the system each time the current dictionary fills up until all the characters from
the raw data stream 414 are compressed.

Memory 432 illustrates the memory ready for decompression immediately after initialization for
decompression. Memory 434 illustrates the system in state S=0 immediately before changing from state S=0 to state
S=1. Memory 436 illustrates the dictionary after decompressing compressed character
stream 422. The dictionary in memory 432 is initialized so that the first four address locations contain the
decoded values for the single input characters R, I, N, and T respectively. Again, single character decoding may
also be performed algorithmically. The system is set to state S=0 and all dictionary status registers are set to
FREE(S). The address pointer NEXT_CODE is set to the first available dictionary location (ADDR4) and the
address pointer SAVE_CODE is set to zero.

Decompression is conducted as described earlier, whereby OMEGA is used as the address pointer into
memory 432. The first input code from compressed character stream 422 constitutes an OMEGA value
(OMEGA=0). The decompression system determines that the value "0" is a root codeword, for example, by checking
that the value is less than 4. The data entry at ADDR0 (e.g., "R") is thereby output as the first character in the
decompressed character stream 430. The status field at address location OMEGA is then set to SD(S).

The dictionary is rebuilt by writing the first character K from the decompressed codeword (e.g., "R") back
into the CHAR field of address location SAVE_CODE. In this case, "R" is rewritten into the CHAR field of
ADDR0. The character string (CD(S), 0, R) is then written into address location NEXT_CODE (e.g., ADDR4)
and SAVE_CODE is set to the value of NEXT_CODE (e.g., SAVE_CODE=4). The address pointer NEXT_CODE
is then assigned the value of the next free address in memory 434 (e.g., NEXT_CODE=5).

The character "1" is read from compressed character stream 422 and serves as the next value for OMEGA.
OMEGA is decompressed and the decoded character "I" is output as the next character in decompressed
character stream 430. The ST field of address ADDR1 is set to SD(S) (e.g., [1:1]) and the first character from the
decompressed OMEGA value ("I") is written into the CHAR field at address location SAVE_CODE (ADDR4).
The character string (CD(S), 1, I) is then written into the ST, CODE, and CHAR fields of memory 434
respectively at address NEXT_CODE (e.g., ADDR5). The value of NEXT_CODE is used as the new value for
SAVE_CODE. The next FREE status field is located and NEXT_CODE is set to that address
(NEXT_CODE=6).

The process continues in a similar manner for encoded characters "2" and "3" from compressed character
stream 422. The first "5" from compressed character stream 422 is the first non-root codeword, and the data
entry at address ADDR5 is a character string; therefore, the CODE field at ADDR5 ("1") is fed back as
the next memory location read by the CAM. The output at ADDR1 ("I") along with the previous CHAR field "N"
are then output by the CAM, and the ST field at ADDR5 is set to SD(0). The first character from the
decompressed codeword ("I") is written into the CHAR field of memory location ADDR7 (e.g. SAVE_CODE=7), the
character string (CD(S), 5, I) is written into CAM location NEXT_CODE (e.g. ADDR8), and the value of
SAVE_CODE is set to the value of NEXT_CODE (e.g. SAVE_CODE=8). Memory 434 shows the status of the
current dictionary immediately after this write.
The next search indicates no status field contains a FREE value; therefore, the system is switched into
state S=1 and the status register [...]. Referring to memory 436, the address pointer NEXT_CODE is assigned
the first FREE memory location (ADDR4). Address locations ADDR0-ADDR3 and ADDR5 are now entries in the
current dictionary while address locations ADDR4 and ADDR6-ADDR8 constitute FREE locations in state S=1.
Character 438 from compressed character stream 422 is set to OMEGA (OMEGA=3) and decompressed in the new
decompression state S=1. The decoded character "T" is output in decompressed character stream 430, and the
ST field at ADDR3 is assigned the value SD(S) for state S=1 (e.g. [1:0]). SAVE_CODE points to ADDR8, so the
character "T" is written into the CHAR field at ADDR8 in memory 436 [...] address location ADDR4 and
SAVE_CODE is assigned the value NEXT_CODE. The next FREE dictionary location is ADDR6 and accordingly
is assigned to the address pointer NEXT_CODE. The process continues in the same manner until all
characters in compressed character stream 422 are decompressed.

In traditional LZ2 implementations, codes are assigned sequentially, with single-character strings being
assigned codes in the following order: C_0, C_0+1, C_0+2, ..., C_0+(2^m - 1), where C_0 is some small constant
(e.g. C_0=0). The new strings are assigned codes C_0+2^m, C_0+(2^m + 1), ..., 2^b - 2, 2^b - 1, in that order,
where each subsequent character string has a sequential address value in the CAM. Hence, assignment of
codes to strings is achieved simply by keeping a counter initialised to C_0+2^m, and incrementing it every time
a new dictionary string is created. This allows the compressor to use variable length output codes, using codes
of length m+1 after a dictionary reset and increasing the length of the output code by one bit every time the
number of entries in the dictionary doubles. Therefore, the length of the output codes varies between (m+1)
and b, where 2^b is the maximum size of the dictionary. This yields some gain in compression ratio, since the
compressor uses shorter output codes when the dictionary address code is shorter. The decompressor builds its
dictionary in lock-step with the compressor, and can keep track of the expected length of the compression codes.
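
The relationship between the code counter and the output code length can be sketched as follows. This is an
illustration only, not the method of any particular implementation: it assumes C_0 = 0 and the usual convention
that the length grows by one bit each time the counter reaches a power of two. The function name `code_width`
and its parameters are hypothetical.

```python
def code_width(next_code: int, m: int, b: int) -> int:
    """Return the output code length, in bits, for a sequential-code LZ2 scheme.

    next_code is the counter holding the code for the next new string
    (initialised to 2**m, assuming C_0 = 0); b is the maximum code length.
    """
    if not 2 ** m <= next_code <= 2 ** b:
        raise ValueError("counter outside the range 2**m .. 2**b")
    # The width must be able to hold the counter value itself, so it starts
    # at m + 1 bits and grows by one bit each time the counter reaches a
    # power of two, up to the maximum of b bits.
    return min(b, next_code.bit_length())
```

With 8-bit characters (m = 8) and b = 12, the length is 9 bits until the counter reaches 512, then 10 bits, and so
on up to 12 bits.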

In the process illustrated in FIG. 16, the encoded value for a new character string is the address of the
first FREE dictionary location. Immediately after a dictionary switch, the CD consists of character strings from
the previous standby dictionary with locations in the CAM that are not necessarily contiguous. These strings
preserve their old addresses, and thereby their codes, after the switch. Therefore, the addresses (codes)
returned by the search for FREE do not form a contiguous sequence. Also, every encoded character string C in
the range 0 <= C < 2^b is potentially available immediately after dictionary reset.

As a consequence, the output stream must use fixed length codes. However, the impact of this
on compression ratio is not significant. Since the CD starts partially filled after a dictionary reset, even if the
codes in the CD were reordered, the number of bits required to represent codes would not be far from the
maximum bit length (b). For example, in experiments, it was found that the current dictionary typically starts
between 1/4 and 1/2 full. This means that b-1 bits would be required after the switch even if the codes were
aligned in contiguous order. It is possible, however, to use a variable length code during the building of the
first current/standby dictionary (CD, SD), or the current dictionary could be reordered after each reset.
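
The b-1 figure can be checked with a short calculation. The sketch below is illustrative only: the helper name
`bits_for_contiguous_codes` is hypothetical, and it assumes that the surviving entries are renumbered
contiguously as codes 0 to entries-1.

```python
def bits_for_contiguous_codes(entries: int) -> int:
    """Bits needed to address `entries` codes renumbered as 0 .. entries-1."""
    if entries < 1:
        raise ValueError("need at least one entry")
    # The largest code is entries - 1; its bit length is the code width.
    return max(1, (entries - 1).bit_length())
```

For b = 12 (4096 locations), any dictionary holding more than 1024 and at most 2048 entries needs 11 bits, that
is b-1, so reordering would save at most one bit per code in the range observed experimentally.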
Compression Results

The compression and decompression processes for the CAM multi-dictionary system were applied to
various types of data, including source code, executable object code, ASCII data files, test files, and bitmap
image files. The same files were compressed with a traditional LZW scheme using variable length output codes.
An overall result of the compressions is shown in FIG. 21. Line 440 is the graphical representation of the
compression ratio for the CAM multi-dictionary system and line 442 illustrates the compression ratio for a standard
LZW algorithm. Lines 440 and 442 plot compression ratio (original file size/compressed file size) as a function
of b, the maximum number of bits per output code.

To emphasize the advantage of the CAM/standby dictionary method, a dashed line [...] is drawn in the plot
at the compression ratio achieved by a 12-bit LZW compressor. The value of b for the CAM multi-dictionary
process achieving the same compression ratio is then located. As illustrated in FIG. 21, the CAM
multi-dictionary system provides [...] of the number of dictionary entries as the standard LZW compressor
(e.g. one less bit [...]) [...] achieved with CAM dictionary entries that are only [...] bits longer than a
conventional LZW compressor data entry.

For clarity, a minimal implementation of the standby dictionary scheme has been illustrated. Many
modifications can be implemented to further increase the compression ratio. For example, the compression/
decompression process illustrated in FIGS. [...] dictionary with a set of all single-character strings in the
input alphabet [...] initialization could be used as previously described. A process based on a combination of
intermediate initialization and standby dictionaries can also produce high compress[...].

An additional method for implementing the system [...] the current dictionary fills up. Instead, the current
dictionary is frozen, and a dictionary switch is based on the compression ratio falling below a certain level
[...]. The standby dictionary can also be frozen, or it can continue to be built until the next dictionary switch.

Another modification, specific to the [...] INV denoted in FIG. 13. Currently, INV is not used. [...] be used
to define a second level of standby dictionaries, denoted SD2. An entry that is already labeled SD, upon being
referenced more than once, would be changed to SD2 (a new name for the current INV value). At dictionary
switching time CD entries would become FREE, SD entries become CD entries, and SD2 entries become SD
entries. The first standby dictionary (SD) would be started [...] strings in SD2, and a new SD2 would start
from scratch. This modification is easily implemented in the system illustrated in FIG. 16 by one skilled in the
art.



Thus, a variant of the Lempel-Ziv [...] dictionary in parallel with the current compression dictionary. When
the current dictionary fills up, the standby dictionary replaces it, and a new standby dictionary is started. The
standby dictionary contains a selected subset of the [...] allows for the implementation of both dictionaries on
the same memory buffer. The preferred system [...] a content addressable memory module [...]. To reduce
processing time and circuit [...] state transition scheme which eliminates the need for dictionary [...]. CAM
multi-dictionary compression [...] with traditional data compression implementations, using only a fraction of
the memory, and with only a moderate increase in the complexity of the control circuitry.

Selective Overwrite Method of Data Compression/Decompression in a CAM-Based Multiple Dictionary
System

A second Lempel-Ziv Standby Dictionary (LZSD2) data compression and decompression method is now
described that uses the compression/decompression system previously shown in FIG. 14. The LZSD2 method allows
all dictionary entries to be used for character string matching at all times. By filling available CAM storage
locations with encoded data entries and keeping each available storage location assigned to a dictionary,
reduced compression performance typically occurring after a dictionary swap is eliminated. Therefore, overall
system compression performance is improved over multiple dictionary swapping schemes that discontinue
using [...].
LZSD2 Compression

Three dictionaries are used for LZSD2 compression. The Current Dictionary (CD) holds new strings and 
strings demoted from the Standby Dictionary. The Standby Dictionary (SD) holds strings that have been 
matched by a dictionary search and are therefore "good" strings. There are also alternative criteria for assigning
data entries to the standby dictionary. The FREE/Previous Dictionary (FREE/PD) contains storage locations
not currently assigned data entries and data entries which have been demoted from the Current Dictionary.
Data entries in the FREE/PD dictionary are also selectably overwritten with new character strings. When the
FREE/PD locations are filled up, the CAM changes state, in turn creating in effect a dictionary swap.

The dictionary swap changes the priority in which data entries are overwritten with new input character 
strings. For example, data entries in the CD are demoted to FREE/PD and data entries in the SD are demoted 
to the CD. Therefore, after a dictionary swap, data entries previously assigned to the CD and which were not
subject to being overwritten with new character strings are now demoted to FREE/PD and are subject to being
overwritten with encoded character strings.

Referring to FIGS. 22A-22E, complete and continuous utilization of all dictionary space is carried out
generally by searching the three dictionaries SD, CD and FREE/PD at the same time. Initially, all CODE and CHAR
fields in each available storage location in the CAM are reset to a known value, typically null, and assigned to
the FREE/Previous Dictionary (FREE/PD) as shown in FIG. 22A. Available storage locations refer to address
locations in the CAM that are available for storing a data entry. A dictionary data entry comprises a string that
includes PREVCODE N, which is the address of the best dictionary match that has been found so far, and CH
N, which is the most recent character from the input data stream.

A first character string (PREVCODE1,CH1) is stored in the first available address location (ADDR0) in the
FREE/PD dictionary and assigned to the Current Dictionary (CD). The next unique character string
(PREVCODE2,CH2) is stored in the next available FREE/PD storage location (ADDR1) and also assigned to CD.
Character strings (PREVCODE3,CH3) and (PREVCODE4,CH4) are stored in the CAM at the next available
addresses ADDR2 and ADDR3, respectively, and both are assigned to CD.

Referring to FIG. 22B, new unique character strings continue to be stored in available FREE/PD storage
locations and assigned to CD. If the compression process receives a new character string that has already
been stored in the CAM as a data entry, and not overwritten, the data entry is reassigned or promoted to the
standby dictionary (SD). For example, data entry (PREVCODE1,CH1) has previously been stored at address
location ADDR0. Therefore, if a new character string contains the (CODE, CH) values (PREVCODE1,CH1), the
data entry at ADDR0 is reassigned to SD.

FIG. 22C shows previously stored data entries (PREVCODE1,CH1), (PREVCODE3,CH3), and
(PREVCODE4,CH4) assigned to the standby dictionary since each such data entry matched a new input character
string. The CAM remains in the present state until each available CAM storage location (e.g., the FREE/PD
location at ADDR7) is filled with a data entry assigned to either the CD or SD as shown in FIG. 22C. FIG. 22C
shows the last available FREE/PD location at address location ADDR7 replaced with a data entry prior to the
dictionary swap.

After all FREE/PD locations have been assigned data entries, the CAM changes state, causing a dictionary
swap. FIG. 22D shows the status of each data entry immediately after the dictionary swap. For example, all
data entries previously assigned to the SD are reassigned to the CD (SD->CD) and all data entries previously
assigned to the CD are reassigned to FREE/PD (CD->FREE/PD). It is important to note that after the dictionary
swap all data entries remain assigned to a dictionary. There will be no standby dictionary entries after the swap.
For example, the data entries previously assigned to the CD at address locations ADDR1 and ADDR4-ADDR7
are reassigned to FREE/PD. Therefore, all data entries remain available for character encoding after a CAM
reset, and data compression performance is maintained since no previously encoded compression data
is lost after a CAM reset.

The LZSD2 method also has the capacity to adapt for new input data by selectively replacing data entries
assigned to FREE/PD with new character strings not previously stored in the CAM. Specifically, if a new
character string matches a data entry in either FREE/PD or the current dictionary (CD), the matching data entry
is reassigned to the standby dictionary (SD). For example, in FIG. 22E, the next input character string
(PREVCODE1,CH1) matches the CD data entry at address location ADDR0. Therefore, the data entry at ADDR0
is reassigned to SD. Further, the input character string (PREVCODE5,CH5) matches the data entry at address
location ADDR4. Therefore, the data entry at ADDR4 is reassigned from FREE/PD to SD:
(PREVCODE5,CH5,FREE/PD) -> (PREVCODE5,CH5,SD).

If an input character string does not match any existing data entry, a new character string is put into a
FREE/PD dictionary location and initially assigned to CD. For example, the input character string
(PREVCODE9,CH9) does not match any data entry in the CAM. Thus, (PREVCODE9,CH9) is written into the CAM
storage location of the FREE/PD with the lowest address (i.e., ADDR1) and assigned to CD
(PREVCODE9,CH9,CD). It is also possible to assign data entries based on criteria other than lowest FREE/PD
address location. If the same string (PREVCODE9,CH9) recurs before a subsequent reset and overwrite, this
entry would then be promoted to SD.
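
The promote-on-match and overwrite-on-miss behaviour described for FIG. 22E can be sketched in software. This
is a simplified model, not the CAM circuit itself: the CAM is represented as a Python list indexed by address, the
function name `present` is hypothetical, and the contents shown for ADDR5-ADDR7 are placeholders because the
text does not state them.

```python
FREE_PD, CD, SD = "FREE/PD", "CD", "SD"


def present(cam, code, ch):
    """Look up (code, ch) in the CAM model; promote on a hit, insert on a miss.

    cam is a list of [code, ch, status] entries indexed by address.
    Returns the address that was promoted or written, or None when the
    string is new and no FREE/PD location is left.
    """
    # Hit: the status field is ignored, so all three dictionaries are searched.
    for addr, entry in enumerate(cam):
        if entry[0] == code and entry[1] == ch:
            entry[2] = SD
            return addr
    # Miss: overwrite the FREE/PD location with the lowest address.
    for addr, entry in enumerate(cam):
        if entry[2] == FREE_PD:
            cam[addr] = [code, ch, CD]
            return addr
    return None


# State after the swap of FIG. 22D (ADDR5-ADDR7 contents are placeholders).
cam = [
    ["PREVCODE1", "CH1", CD], ["PREVCODE2", "CH2", FREE_PD],
    ["PREVCODE3", "CH3", CD], ["PREVCODE4", "CH4", CD],
    ["PREVCODE5", "CH5", FREE_PD], ["PREVCODE6", "CH6", FREE_PD],
    ["PREVCODE7", "CH7", FREE_PD], ["PREVCODE8", "CH8", FREE_PD],
]
hits = [
    present(cam, "PREVCODE1", "CH1"),  # matches the CD entry at ADDR0
    present(cam, "PREVCODE5", "CH5"),  # matches the FREE/PD entry at ADDR4
    present(cam, "PREVCODE9", "CH9"),  # new string, written at ADDR1
]
```

In this model the three calls return addresses 0, 4 and 1, and leave ADDR0 and ADDR4 in SD and ADDR1 in CD,
which mirrors the example above.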

It can be seen that all data entries in the CD dictionary and those in the FREE/PD dictionary that have
not yet been overwritten are still utilised at all times for character string matching. Since all data entries remain
assigned to a dictionary after a CAM state change, no compression information is lost. Thus, the dictionaries
can be continuously updated without temporary degradations in the data compression rate.

There are several methods for selecting the data entry in the FREE/PD that is overwritten when a new
input character string is identified in the compression process. As discussed above, a new character string
that does not match any data entry in either the FREE/PD, CD, or SD is overwritten into the storage location
in FREE/PD with the lowest address. When the dictionary is initially [...] the address monotonically (as
compared to using a hashing scheme) [...] dictionary entry. This situation will remain true, however, only until
all FREE/PD locations have been overwritten once. Alternatively, individual data entries in the FREE/PD
dictionary can also be deterministically selected for replacement with new input character strings.

For example, data entries can be overwritten in FREE/PD according to how long the prior data entry has
resided in a dictionary. In the example, each [...] it was written into a CAM storage location. The LZSD2
search process then selects the FREE/PD data entry with the tag value indicating it was least recently used
(LRU). The least recently used data entry in FREE/PD would be the data entry that has resided in the CAM for
the largest amount of time without matching an encoded character string.

The LRU data entry in some situations may have the highest probability of not matching a new character
string. Therefore, overwriting the LRU data entry has the potential of minimizing any loss of compression
information that could occur when an existing data entry [...] data entries
is described in detail by Bunton and Borriello in PRACTICAL DICTIONARY MANAGEMENT FOR HARDWARE
DATA COMPRESSION, Communications of the ACM, January 1992, Vol. 35, No. 1.
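
A least-recently-used selection of the FREE/PD location to overwrite can be sketched as follows. This is a
minimal software model under stated assumptions, not the tagging circuit itself: each entry carries a hypothetical
integer `tag` that is taken to be refreshed whenever the entry is written or matched (the text only states that a
tag is associated with each entry), and ties are broken by the lowest address.

```python
def select_lru_victim(cam):
    """Return the address of the FREE/PD entry with the oldest tag, or None.

    cam is a list of dicts with keys "status" and "tag"; a smaller tag
    means the entry was written or matched longer ago.
    """
    candidates = [
        (entry["tag"], addr)
        for addr, entry in enumerate(cam)
        if entry["status"] == "FREE/PD"
    ]
    if not candidates:
        return None  # no FREE/PD location left: a dictionary swap is needed
    # min() compares tags first and addresses second.
    return min(candidates)[1]
```

Entries in CD or SD are never candidates, whatever their tag, because only FREE/PD locations may be overwritten.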
FIG. 23 is a data flow diagram showing the general method for LZSD2 data compression [...]
compression/decompression system [...]
(CH) from the input character string are read one at a time in block 452. If an End of File (EOF) condition is
identified in decision block 456, decision block 454 then checks to see whether it is the first time through the
compression cycle. If the EOF condition is encountered the first time through the compression cycle, decision
block 454 ends the LZSD2 compression process. If it is not the first time through the compression cycle when
the EOF condition is detected, block 460 outputs the previously matched sequence, PREVCODE,
and block 464 provides additional cleanup for ensuring proper formatting of encoded output characters.

Referring back to decision block [...], block 458 searches all three
dictionaries (i.e., FREE/PD, CD, and SD) for the extended string (PREVCODE,CH). If the (PREVCODE,CH)
string is matched with a previously stored data entry, decision block 462 jumps to block 472. If
(PREVCODE,CH) is not already in the SD, block 472 reassigns the CAM location with the matching
(PREVCODE,CH) data entry to the Standby Dictionary. The matching data entry is reassigned into the SD by
changing the status bits. The [...] of the matched data entry [...] assigned to PREVCODE; [...] then jumps
to block 452 where the next input character [...] value of (PREVCODE). In this
way, the status bits for the stored codes for each substring within a matched string are updated so they will
be retained in the subsequent reset.

Once a string has been extended to the point that a match does not occur in decision block 462, block 466
outputs PREVCODE as the best match found. If there is an available FREE/PD location, block 468 updates
the dictionary by writing (PREVCODE,CH) into the next available address (e.g., the FREE/PD dictionary location
with the lowest address) and assigns it to the Current Dictionary (CD). If there are no available FREE/PD
locations, block 468 updates the dictionaries by swapping the current dictionary into the FREE/PD dictionary and
swapping the standby dictionary into the current dictionary by changing the status bits, that is,
(CD->FREE/PD, SD->CD). Block 470 prepares for the next input character string by assigning CH to PREVCODE
(CH->PREVCODE). There are alternate mappings from single-character strings to compressed codes. The
compression process then returns to block 452 to read and combine the next input character CH with PREVCODE.
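
The general flow of FIG. 23 can be sketched as a short software model. It is an illustration under stated
assumptions rather than the hardware method: single characters are assumed to map to codes 0-255 and dictionary
address N to code 256+N (the text notes that alternate mappings exist), the name `lzsd2_compress` is
hypothetical, and the refinements of FIG. 24 (GROW codes, MAXDEPTH, LAST_CODE_BUILT, the delayed swap) are
deliberately omitted, so no claim is made that the decompressor described later can decode this output.

```python
FREE_PD, CD, SD = "FREE/PD", "CD", "SD"
ROOTS = 256  # assumed mapping: single characters use codes 0..255


def lzsd2_compress(data: bytes, size: int):
    """Compress data with a CAM model of `size` entries.

    Returns (codes, status), where codes is the list of output codes and
    status is the final status field of each CAM location.
    """
    entry = [None] * size        # (PREVCODE, CH) stored at each address
    status = [FREE_PD] * size    # dictionary assignment of each address
    codes = []
    prevcode = None              # None marks the first pass through the loop

    for ch in data:                                   # block 452
        if prevcode is None:
            prevcode = ch
            continue
        # Block 458: search all three dictionaries at once (status ignored).
        if (prevcode, ch) in entry:                   # decision block 462
            addr = entry.index((prevcode, ch))
            status[addr] = SD                         # block 472
            prevcode = ROOTS + addr
            continue
        codes.append(prevcode)                        # block 466
        if FREE_PD in status:                         # block 468
            addr = status.index(FREE_PD)              # lowest address
            entry[addr] = (prevcode, ch)
            status[addr] = CD
        else:
            # Dictionary swap: CD -> FREE/PD, SD -> CD.
            status = [FREE_PD if s == CD else CD for s in status]
        prevcode = ch                                 # block 470

    if prevcode is not None:                          # EOF handling
        codes.append(prevcode)
    return codes, status
```

For example, `lzsd2_compress(b"ABCDBC", 2)` fills both locations, swaps when "CD" cannot be stored, and later
matches "BC" against an entry that was demoted to FREE/PD but not yet overwritten, which is then promoted to SD.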

FIG. 24 is a detailed data flow diagram for the LZSD2 compression scheme shown in FIG. 23. The following
variables are used to describe LZSD2 compression and decompression.
CAM               Content Addressable Memory. Each dictionary entry contains ([MAXBITS bit code field],
                  [8 bit character field], [2 bit status field]).
CD                Two bit Status value which indicates that the dictionary entry is in the Current Dictionary.
CH                Eight bit variable which contains the most recent input character.
CODE_SIZE         The number of bits currently used for each output code. The minimum is 9 and the
                  maximum is MAXBITS, which is determined by the dictionary size: 2^MAXBITS >= (Number of
                  dictionary entries) + (Number of root codes) (typically 256) + (Number of control codes).
DEPTH             Variable which contains the number of characters in the string represented by
                  PREVCODE during compression or INCODE during decompression. The size of this variable
                  is determined by MAXDEPTH.
EOF               A flag which indicates, when set, that an attempt to read data from the input stream failed
                  because the end of the data stream was reached.
FIRST_CHAR        Eight bit variable which contains the first character of the string represented by INCODE.
FOUND_CODE        [...] the input data string, and a match is found [...] the address at which the match
                  was found.
FREE/PD           Two bit Status value which indicates that the data entry is in the Previous Dictionary. It
                  also indicates that the location can be overwritten.
GROW              A CODE_SIZE bit control code which signals the decompressor to start reading one more
                  bit for each compressed code.
INCODE            MAXBITS bit variable whose value is read from the compressed data stream.
INVALID           This is any MAXBITS bit code that is not a dictionary entry, i.e., INVALID may represent
                  a control code or a root code.
LAST_CODE_BUILT   MAXBITS bit variable which contains the address of the most recently built code.
MATCH             This indicator is true if a search of the dictionary succeeded in finding a match.
MAXBITS           Maximum number of bits [...].
MAXDEPTH          The maximum string length that a code is allowed to represent.
NEXTCODE          MAXBITS bit variable which contains the address of the dictionary entry that is to be
                  overwritten with the new character string.
PREVCODE          A MAXBITS bit variable which contains the address of the best dictionary match that has
                  been found so far during compression. During decompression PREVCODE represents
                  what INCODE was during the previous cycle.
PREVDEPTH         The string length of PREVCODE during decompression only.
SD                Two bit Status value which indicates that the dictionary entry is in the Standby Dictionary.
STACK             An eight bit by MAXDEPTH LIFO queue used for string reversal.
SWAP_FLAG         Indicator flag that is true when a dictionary swap is needed.
TCODE             A MAXBITS bit variable used as a temporary storage location when decoding INCODE.
TDEPTH            Temporary [...] STACK depth while it is being emptied.

Referring to FIG. 24A, at startup, block 474 initializes the CAM to a known, consistent state. Each dictionary
entry is set to a predetermined value. For example, the code field, character field, and status fields are
typically set at 000 Hexadecimal (HEX), 00 Hex, and FREE/PD, respectively. Thus, every dictionary entry
[...] the two character string (NULL, NULL, FREE/PD). It is also possible to initialize each dictionary entry to
different string values to further increase the compression ratio. For example, character string combinations
that occur frequently in the input character stream can be written into the FREE/PD dictionary prior to
beginning the compression process.

Output format control is carried out by the compressed data [...] are reset
by block 474 to an empty initial state. The CODE_SIZE variable/register is typically set to a minimum value such
as 9, the LAST_CODE_BUILT register is set to INVALID, the DEPTH register is set to 0, and SWAP_FLAG is
unset.

Block 476 reads an eight bit character from the input character stream and assigns it to variable CH.
Decision block 478 determines whether a data read failure occurs due to reaching the end of the input stream
(i.e., EOF flag). If an EOF condition is detected, decision block 486 ends the compression process and outputs
any remaining encoded information. If the data read process succeeds (i.e., no EOF condition), decision block
478 continues the LZSD2 compression process.

When a data read failure occurs, decision block 478 jumps to decision block 486 where the string length
of PREVCODE is checked. If DEPTH = 0, PREVCODE has a 0 string length and cannot be output. Since the
EOF flag also indicated that the end of the input data stream has been reached, DEPTH = 0 indicates that there
is nothing left to output; compression is finished and block 486 ends the compression process. DEPTH = 0
only after initialization, which means that no data was input. If DEPTH > 0, decision block 486 jumps to decision
block 492.
[...] decision block [...] checks [...] the DEPTH variable/register. If DEPTH = 0, PREVCODE has a 0 string
length and cannot be used in dictionary searches. Therefore, block 480 assigns the input character CH to
PREVCODE (PREVCODE = CH). PREVCODE now represents a one character string, so DEPTH [...].

[...] valid character string and the compression process continues to block 488 [...] the field (PREVCODE,CH).
The simultaneous search of all [...] is [...] simply searching only the code and character fields and
disregarding the [...]. This is done by using a priority [...] the lowest value. When a location [...] the
MATCH [...] assigns [...] match address to FOUNDCODE, and goes to decision block 490.

[...] locating a (PREVCODE,CH) [...] determine whether FOUNDCODE is an acceptable output code. Two
additional tests must also be passed in decision block 490. First, the [...] the compressor. Therefore,
FOUNDCODE cannot be equal to LAST_CODE_BUILT, which prevents the most recently built dictionary entry
from being used as an output code.

[...] longer than MAXDEPTH were output, the decompressor string reversal register (see FIG. 4) would overflow.
[...] matches the (PREVCODE,CH) [...] Block [...] then sets PREVCODE to be the [...] string match [...] so far,
namely FOUNDCODE (PREVCODE = FOUNDCODE), and [...] block 476 [...] the next input character (CH) is read
and appended to the new PREVCODE value, creating the new string (PREVCODE,CH).

Referring back to decision block 490, if the (PREVCODE,CH) [...] outputs PREVCODE and the
(PREVCODE,CH) string is assigned to the current dictionary in block 514 (i.e., PREVCODE,CH,CD) as
described below.

Before output, the number of bits in PREVCODE is checked in decision block 492 (see FIG. 24B).
If PREVCODE is greater than or equal to [...] number of CODE_SIZE bits (e.g., 9). In this case, block 494
increases [...]. Block 494 also outputs a GROW control code using CODE_SIZE bits which must
be packed into bytes by the formatter [...] signal to the decompressor (FIG. 26A) that all future codes will be
one bit longer than the current code size. It is possible for PREVCODE to require more than one more bit [...]
output [...] block 494 jumps to block 492 [...] if another GROW must be sent before actually outputting PREVCODE.
Block 496 then outputs PREVCODE using CODE_SIZE number of bits. The CODE_SIZE number is used by
the formatter to pack PREVCODE into bytes before being output.
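
The loop between blocks 492 and 494 can be sketched as follows. This is a hedged illustration, not the formatter
logic itself: it assumes that the test in block 492 is whether PREVCODE fits in CODE_SIZE bits, that each GROW
code is written at the width in effect before the increase, and that GROW is a reserved code value (256 here,
chosen arbitrarily). The function name `emit_code` is hypothetical.

```python
GROW = 256  # hypothetical reserved control code value


def emit_code(code: int, code_size: int, max_bits: int):
    """Return ([(value, width), ...], new_code_size) for one output code.

    GROW codes are queued, one per extra bit required, until `code`
    fits in the current width; the code itself is queued last.
    """
    out = []
    while code >= (1 << code_size):          # decision block 492
        if code_size >= max_bits:
            raise ValueError("code does not fit in MAXBITS bits")
        out.append((GROW, code_size))        # block 494: signal one more bit
        code_size += 1
    out.append((code, code_size))            # block 496
    return out, code_size
```

A code of 2048 emitted while CODE_SIZE is 9 therefore produces three GROW codes (at 9, 10 and 11 bits) followed
by the code itself at 12 bits.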
Decision block 498 is a continuation of the EOF check previously performed in decision block 478. A
detected EOF condition in decision block 478 may come back into the main compression flow at block 492 in
order to output the last best match code (PREVCODE) (see decision block 486). In addition, the last code output
[...] 500 pads the leftover bits with 0's or 1's, if needed, before outputting the final byte. At this point, the
compression process is finished.

If no EOF flag is detected, decision block 502 checks whether the SWAP_FLAG is set. If the SWAP_FLAG
is set, the dictionaries are swapped by replacing FREE/PD with CD, and replacing CD with SD (SD->CD,
CD->FREE/PD). Swapping the dictionaries does not actually change any data in the CAM but changes how the
status field is interpreted by the compression engine (see FIG. 17). After a dictionary swap, the status register
code that previously represented SD now represents CD, the status register code representing CD now
represents FREE/PD, FREE/PD becomes INV, and INV becomes SD. INV remains empty because FREE/PD is
always empty before the swap, thereby keeping INV empty after the swap. Also, since INV was empty before
the swap (INV is always empty), the Standby Dictionary (SD) is also empty after the swap.
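
The reinterpretation of the status field can be sketched as a rotation of the mapping from raw two-bit codes to
dictionary names. This is a minimal model only: the particular bit assignments below are an assumption (the
actual encoding is defined by FIG. 17), and the names `swap` and `interpret` are hypothetical.

```python
ROTATE = {"SD": "CD", "CD": "FREE/PD", "FREE/PD": "INV", "INV": "SD"}


def swap(meaning):
    """Return the new meaning of each raw 2-bit status code after a swap."""
    return {raw: ROTATE[name] for raw, name in meaning.items()}


def interpret(raw_status, meaning):
    """Translate the stored raw status bits using the current mapping."""
    return [meaning[raw] for raw in raw_status]


# Assumed initial encoding of the two status bits.
meaning = {0b00: "FREE/PD", 0b01: "CD", 0b10: "SD", 0b11: "INV"}
# Raw status bits stored in the CAM just before a swap (no FREE/PD left).
raw_status = [0b10, 0b01, 0b10, 0b01]
after = swap(meaning)
```

The stored bits in `raw_status` are never rewritten; only the mapping changes, so former SD entries read as CD,
former CD entries read as FREE/PD, and no entry reads as SD or INV immediately after the swap.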
If decision block 502 determines that the SWAP_FLAG is not set, decision block 506 (FIG. 24C) checks
to see if the number of characters in PREVCODE (DEPTH) is less than the maximum allowable string length
(MAXDEPTH). If DEPTH = MAXDEPTH, the character string (PREVCODE,CH) is too long to be used as an
output code and, therefore, will not be added to the current dictionary. If DEPTH < MAXDEPTH and there is
a location available, the character string (PREVCODE,CH) is added to the current dictionary.

First, block 508 searches the CAM for an available FREE/PD location by checking the CAM status fields
while ignoring the code (PREVCODE) and character fields (CH). Thus, a match can be successful regardless
of code or character field values. Since it is possible that there is more than one FREE/PD data entry, a priority
encoder selects the match address with the lowest value.
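
A masked CAM search with a lowest-address priority encoder can be modelled in a few lines. This is a software
sketch under stated assumptions, not the circuit: entries are (code, ch, status) tuples, the sentinel `ANY`
marks a field that is ignored in the comparison, and the name `cam_search` is hypothetical.

```python
ANY = object()  # sentinel: this field is masked out of the comparison


def cam_search(cam, code=ANY, ch=ANY, status=ANY):
    """Return the lowest address whose unmasked fields all match, or None."""
    for addr, (e_code, e_ch, e_status) in enumerate(cam):
        if code is not ANY and e_code != code:
            continue
        if ch is not ANY and e_ch != ch:
            continue
        if status is not ANY and e_status != status:
            continue
        return addr  # priority encoder: first (lowest) matching address
    return None
```

The same routine covers both searches described in the text: `cam_search(cam, status="FREE/PD")` ignores the
code and character fields, while `cam_search(cam, code=prevcode, ch=ch)` ignores the status field and so
searches all three dictionaries at once.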

As mentioned above, it is also possible to select between multiple matches according to which FREE/PD
dictionary entry was least recently used. For example, a tag could be associated with each dictionary entry
indicating the order in which the entries were stored in the CAM. The priority encoder would then select the
least recently used data entry from among the multiple matches. The least recently used data entry is most
probably the character string that is least likely to match a new encoded character string. Thus, replacing the
least recently used data entry minimizes the effect in losing a small amount of compression information.
Alternate priority selection methods are also capable of being implemented.

If a FREE/PD status field is located, decision block 510 jumps to block 514 where (PREVCODE,CH) is
added to the CAM at the matched address location. Block 514 writes the (PREVCODE,CH) string into the CAM
at address NEXTCODE and assigns the string to the current dictionary. The string (PREVCODE,CH) stays in
CD until a match occurs with a new input character string, whereby (PREVCODE,CH) is then promoted to SD.
Otherwise, (PREVCODE,CH) stays in CD until a dictionary swap is performed, then it is reassigned or demoted
to FREE/PD.

If the search for FREE/PD fails, block 512 sets the SWAP_FLAG indicating that a dictionary swap is
needed. Failure to find a FREE/PD status field also means that the string represented by (PREVCODE,CH) will
not be entered into the dictionary. The dictionary swap is delayed until the next compression cycle (see
decision block 502) in order to maintain synchronization with the decompressor dictionary.

For example, the LZSD2 decompressor (see FIGS. 25 and 26 below) performs the status field update in
a different order than the compressor and [...] immediately after failing to find a
FREE/PD status field. The LZSD2 decompressor updates the status field bits for a given codeword and then
attempts to write the previous codeword into the dictionary. Therefore, delaying the dictionary swap in the
LZSD2 compression process until after the next code's status fields have been updated allows the compressor
and decompressor dictionaries to be identical when a dictionary swap occurs.

If no codeword was built during the compression cycle, block 516 sets LAST_CODE_BUILT to an invalid
value. The most recently built codeword can then be used in future matches. Therefore, if a (PREVCODE,CH)
character string was not built because the maximum string length was exceeded (DEPTH = MAXDEPTH) or
because the dictionary was full, the most recently built dictionary entry does not point to the address of the
last (PREVCODE,CH) match (i.e., FOUNDCODE). Thus, block 516 sets LAST_CODE_BUILT to an INVALID
address which cannot be matched with FOUNDCODE in the next search operation. Block 518 replaces
PREVCODE with CH (PREVCODE = CH). Since PREVCODE now represents a one character string, DEPTH is
set to 1.

Block 518 then jumps back to block 476 to read the next character from the input data stream. The
LZSD2 compression continues until all characters in the input character stream are encoded.

LZSD2 Decompression

In the present embodiment, the same three dictionaries used for LZSD2 compression (CD, SD, FREE/PD)
are also used for implementing the LZSD2 decompression scheme. When the compressor runs out of locations
in which to store new strings (those which have a FREE/PD status register assignment), the decompressor
swaps dictionaries in a manner similar to that discussed above for LZSD2 compression. For example, CD
becomes FREE/PD, SD becomes CD, and SD becomes empty after a dictionary swap. For data decompression,
the processor interface 152 (FIG. 4) controls the flow of compressed data from the compressed data interface
148 through the compression/decompression engine 142 and out the string reversal queue in uncompressed
data interface 138.
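The status reassignment performed by a dictionary swap can be illustrated with a short sketch. It is only an assumption-laden model: the function name `swap_dictionaries` and the list-of-strings representation are hypothetical, and the sketch rewrites every status explicitly for clarity, whereas the claims describe the dictionary assignment as depending on a state of the memory device.

```python
def swap_dictionaries(statuses):
    """Return the status of every entry after a swap: CD -> FREE/PD, SD -> CD."""
    mapping = {"CD": "FREE/PD", "SD": "CD", "FREE/PD": "FREE/PD"}
    return [mapping[status] for status in statuses]

before = ["CD", "SD", "CD", "SD"]
after = swap_dictionaries(before)
```

After the swap no entry carries the SD status, which corresponds to the standby dictionary becoming empty, while every former CD entry is available to be overwritten.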

FIG. 25 is a general flow diagram showing LZSD2 decompression. Block 520 initializes the
compression/decompression system shown in FIG. 4 for LZSD2 decompression. Block 522 then reads encoded input
strings (INCODE) from a compressed input data stream and stores the encoded characters in a temporary
variable/register. The input data stream represents the input character string previously encoded by the LZSD2
compression scheme described above in FIGS. 23 and 24A-C. Decision block 524 checks for an EOF condition,
indicating the end of the encoded character string. Decision block 526 determines whether the input character
INCODE is a control code or an encoded character string.

If the input code INCODE is a control code (i.e., a character carrying instructions for the
decompressor), block 534 evaluates the code and executes the required response. If decision block 526 determines
that INCODE is encoded data (i.e., characters that identify compressed data from the LZSD2 compression
engine), block 528 decodes the root codes (single decoded characters from the encoded data) and pushes
the root codes onto a LIFO register (STACK) located in compressed data interface 148 (FIG. 4). Characters
are then output as decompressed data by popping characters from the STACK until the STACK is empty.

Block 530 updates the decompression dictionary by combining input characters (FIRST_CHAR) and the
previous input code (PREV_CODE). The FREE/PD dictionary is then searched for available storage locations and
the (PREV_CODE, FIRST_CHAR) string is stored in the next available FREE/PD address and assigned to the
current dictionary. If no FREE/PD dictionary location is available, the decompression dictionaries are swapped
[CD->FREE/PD, SD->CD]. Block 532 then prepares the decompression scheme for the next encoded string
and returns to block 522 to repeat another decompression cycle. The decompression scheme accurately decodes the
compressed data back into its original form, as encoded in the LZSD2 compressor. It is important
to note that the compressed data is lossless and contains all compression information within each encoded
character.

FIGS. 26A-C is a detailed flow diagram of the LZSD2 decompression scheme shown in FIG.
25. At decompression start up, the CAM must have the same initial state that existed in the compressor when
it created the compressed data. Block 534 in FIG. 26A sets each CAM entry with the same initial
values originally set in the LZSD2 compressor. For example, each code, character, and status field is typically
assigned the values 000 Hex, 00 Hex, and FREE/PD, respectively. The decompressor interprets the initialized
data entries as (NULL, NULL, FREE/PD). Other initialization schemes are also possible, as described
above, but must be the same for compression and decompression. The input data stream unformatter
(i.e., compressed data interface 148 in FIG. 4) is reset to an empty initial state. CODE_SIZE is set to minimum
(9) and PREVDEPTH is set to 0, indicating a first pass through the decompression loop.

Block 536 reads single bytes from the compressed input data stream and unpacks the bytes into
CODE_SIZE bit codes (i.e., 9 bit codes). If more bits are needed to fill a CODE_SIZE bit code, block 536 reads
additional bytes until CODE_SIZE bits are unpacked. The unpacked bit code is stored in variable/register
INCODE and any leftover bits are used for the next code.
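The unpacking step can be sketched in software as shown below. The bit order within each byte is not stated in this passage, so the sketch assumes most-significant-bit-first packing; the function name `unpack_codes` and the sample bytes are likewise hypothetical.

```python
def unpack_codes(data, code_size=9):
    """Yield code_size-bit codes from bytes, assuming MSB-first packing."""
    buffer = 0   # bits read but not yet consumed
    bits = 0     # number of valid bits held in buffer
    for byte in data:
        buffer = (buffer << 8) | byte
        bits += 8
        while bits >= code_size:
            bits -= code_size
            yield (buffer >> bits) & ((1 << code_size) - 1)
            buffer &= (1 << bits) - 1   # keep only the leftover bits

# Three bytes hold two complete 9-bit codes plus six leftover bits.
packed = bytes([0b00100000, 0b11000001, 0b00000000])
codes = list(unpack_codes(packed))
```

Here the first code spans the first byte and one bit of the second, so a second byte must be read before any code can be produced, which is the "reads additional bytes" behaviour described above.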

If reading the input code in block 536 fails, block 538
ends decompression. If the input code read is successful, decompression continues in decision block 540. If
INCODE is a reserved code, the decompressor evaluates the
reserved code, where the appropriate action is taken. Specifically, decision block 544 determines if INCODE
is a GROW code. A GROW code notifies the decompressor to start reading one more bit for each compressed
code. Other control codes [...]

When CODE_SIZE is at its maximum, the current code size is sufficient to represent all
dictionary entries, and therefore a GROW code can only indicate an error condition. If
CODE_SIZE is already at the maximum when a GROW code is encountered, control passes to block
550 where an error signal is generated. If CODE_SIZE is less than the maximum code size, block 548 increases
the code size by one bit. All future codes are read with one more bit than before. Block 548 then jumps back
to block 536 to read the next input code.
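The GROW handling can be illustrated by a small sketch. The names `handle_grow` and `GrowError` and the maximum of 12 bits used in the example are assumptions for illustration only; the text above gives only the minimum code size of 9.

```python
class GrowError(Exception):
    """Raised when a GROW code arrives while CODE_SIZE is already at maximum."""

def handle_grow(code_size, max_code_size):
    """Return the code size to use for all following input codes."""
    if code_size >= max_code_size:
        # The current size already covers every dictionary entry: error path.
        raise GrowError("GROW code received at maximum code size")
    return code_size + 1

new_size = handle_grow(9, 12)   # 12 is a hypothetical maximum
```

A caller would feed the returned value back into the unpacking step so that every later code is read with one additional bit.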

Referring back to decision block 540, if INCODE is not a control code, block 542 (FIG. 26B) sets up the
decompressor for decoding INCODE. Since INCODE represents at least a one character string, DEPTH is now
set to 1. Since INCODE is needed later in the decompression scheme, a different variable/register TCODE
(Temporary INCODE) is used during decoding. Accordingly, block 542 sets TCODE equal to INCODE.
Decision block 552 determines whether TCODE represents a single character string (e.g., less than
256) or a multiple-character string (e.g., greater or equal to 256). For a multiple-character string, block 554
puts the character at CAM address TCODE on the top of the STACK. TCODE is then assigned to the Standby
Dictionary by setting the Status bits at CAM address TCODE to SD. The value of DEPTH is incremented by
one to equal the present number of characters on the STACK plus one. The "plus one" is the first character
of the string, which has not yet been placed on the STACK. TCODE is then reassigned from the code field at
address TCODE, so that TCODE now represents the remaining string which has not yet been decoded.

If DEPTH is greater than MAXDEPTH, the STACK will overflow. Therefore, decision block 556 checks the
number of characters represented by the code word and generates an error flag in block 558 and terminates
decompression if DEPTH is greater than MAXDEPTH. An error can occur, for example, if the data input to the
decompressor is not created by the LZSD2 compressor. If DEPTH is less than or equal to MAXDEPTH, decision
block 556 jumps back to decision block 552. The decompression process continues to loop through block 554
where single characters CH are gleaned from TCODE and TCODE reassigned from the code field at address
TCODE.

Decision block 552 jumps to block 553 when TCODE is less than 256. TCODE then represents a single
character string (root code). Although not a requirement, all single character strings are mapped to the same
code as the ASCII code for that character and can therefore be placed directly onto the top of the STACK
as the first character of the string INCODE without doing a CAM look up. The first character in the STACK will
be used later; therefore, TCODE is stored in a separate variable/register FIRST_CHAR. TCODE will not change
before FIRST_CHAR is used; therefore, the variable/register FIRST_CHAR could be eliminated by simply using
TCODE. However, FIRST_CHAR is used to make FIGS. 26A-26C easier to understand.

DEPTH is now equal to the number of characters on the STACK since the final character placed on the
STACK was not counted in block 553 but back in block 542. DEPTH is used and changed while emptying the
STACK but the original DEPTH value is used later; therefore, the value of DEPTH is also assigned in block
553 to variable/register TDEPTH.

Referring to FIG. 26C, if TDEPTH is greater than 0, then the STACK is not empty and decision block 560 jumps
to block 562 where single characters are popped off the STACK and output and TDEPTH is decremented. Characters
are popped off the STACK and output until TDEPTH = 0.
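The decoding loop of blocks 542 through 562 can be summarized in the following sketch. It is a simplified model under stated assumptions: the CAM is represented by a hypothetical Python dict keyed by address, the field names `code`, `char` and `status` are illustrative, and a `ValueError` stands in for the error flag of block 558.

```python
def decode(cam, incode, maxdepth):
    """Expand one input code; return (output bytes, FIRST_CHAR, DEPTH)."""
    stack = []
    depth = 1                  # INCODE is at least a one character string
    tcode = incode             # temporary copy; INCODE is needed later
    while tcode >= 256:        # multiple-character string
        entry = cam[tcode]
        stack.append(entry["char"])
        entry["status"] = "SD"     # a used entry is promoted to standby
        depth += 1
        if depth > maxdepth:
            raise ValueError("DEPTH exceeds MAXDEPTH")
        tcode = entry["code"]      # remaining, not yet decoded string
    stack.append(tcode)        # root code goes straight onto the STACK
    first_char = tcode
    tdepth = depth             # working copy, DEPTH itself is kept
    output = bytearray()
    while tdepth > 0:          # pop until the STACK is empty
        output.append(stack.pop())
        tdepth -= 1
    return bytes(output), first_char, depth

# Hypothetical dictionary: 256 = "AB", 257 = "ABC".
cam = {
    256: {"code": 65, "char": 66, "status": "CD"},
    257: {"code": 256, "char": 67, "status": "CD"},
}
text, first_char, depth = decode(cam, 257, maxdepth=8)
```

Because characters are recovered last-to-first while walking the chain, the stack reverses them, so popping yields the string in its original order.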

When the STACK is empty, the decompressor is ready to read a new encoded character from the
compressed character stream. However, before the next encoded character is read, decision block 564 first checks
the value of PREVDEPTH to determine whether PREVCODE
(previously read encoded character) and FIRST_CHAR are to be stored in the next available FREE/PD location and
assigned to CD (PREVCODE, FIRST_CHAR, CD). If PREVDEPTH = 0, it is the first time through the
decompressor and, therefore, there is not a new string to add to the CAM. If PREVDEPTH is greater than or equal
to MAXDEPTH, the new string is too long to be input into the CAM. In either
case, decision block 564 jumps to block 574 without adding a new string
to the CAM.

Otherwise, the CAM is searched for an available
storage location by searching the status field for a FREE/PD value. The Code and Character
fields are not searched since a match can be successful regardless of what the Code or Character fields
contain. Since it is possible for more than one address to meet the conditions of the FREE/PD search, multiple
matches must be reduced to one. Thus, the search uses the same priority encoder used for compression to select
the match address with the lowest value. If a FREE/PD location is found, the MATCH flag is then set and the
match address assigned to NEXTCODE.

If the search for a FREE/PD location is successful, the decompressor writes
(PREVCODE, FIRST_CHAR) to the CAM at the matched location (NEXTCODE) and assigns the string to the
current dictionary (CD). If the search for a FREE/PD location failed, block 568 swaps the dictionaries, i.e.,
(SD->CD, CD->FREE/PD, [...]). Failure to find a FREE/PD location means that the string
represented by (PREVCODE, FIRST_CHAR) will not be entered into the CAM.

Writing (PREVCODE, FIRST_CHAR) extends the previous input string, PREVCODE, by the character FIRST_CHAR. The
(PREVCODE, FIRST_CHAR) string stays in CD until the same (PREVCODE, FIRST_CHAR) string occurs again in
the compressed data stream. The (PREVCODE, FIRST_CHAR) string is then promoted to SD, or demoted
to FREE/PD when the dictionaries are swapped.
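The decompressor's dictionary update can be sketched as below. This is an illustrative model only: the dict-based `cam`, the field names, and the function `update_dictionary` are assumptions, and the sorted scan stands in for the priority encoder that selects the lowest matching address.

```python
FREE_PD, CD, SD = "FREE/PD", "CD", "SD"

def update_dictionary(cam, prevcode, first_char, prevdepth, maxdepth):
    """Return the address written, or None if no string was added."""
    # First pass (PREVDEPTH = 0) or string already at maximum length.
    if prevdepth == 0 or prevdepth >= maxdepth:
        return None
    for address in sorted(cam):
        if cam[address]["status"] == FREE_PD:
            cam[address] = {"code": prevcode, "char": first_char, "status": CD}
            return address
    # No FREE/PD location: swap immediately. Only CD and SD entries exist
    # on this path, so CD becomes FREE/PD and SD becomes CD.
    for entry in cam.values():
        entry["status"] = FREE_PD if entry["status"] == CD else CD
    return None

cam = {
    256: {"code": 65, "char": 66, "status": CD},
    257: {"code": 66, "char": 67, "status": SD},
}
skipped = update_dictionary(cam, 67, 68, prevdepth=0, maxdepth=8)
swapped = update_dictionary(cam, 67, 68, prevdepth=1, maxdepth=8)
written = update_dictionary(cam, 68, 69, prevdepth=1, maxdepth=8)
```

In this run the first call does nothing, the second finds no free location and swaps, and the third overwrites address 256, which the swap had demoted to FREE/PD.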

Block 574 sets PREVCODE equal to the value of INCODE. PREVCODE may be used in the next pass
through the decompression to make a new (PREVCODE, FIRST_CHAR) string for placing in the
dictionary by using the first character of the next input code as an extension character. PREVDEPTH is set equal
to the value of DEPTH in order to keep track of the string length of PREVCODE to prevent a greater than
maximum length string from being added to the CAM. The process then returns to block 536 (FIG. 26A).

Thus, it has been shown that LZSD2 increases the performance of the
compression/decompression system by maintaining all data entries in dictionaries [...]; all data entries
remain capable of matching new input character strings [...]. Thus,
compression will not drop off immediately following a dictionary swap. LZSD2 also has the capability of adapting
to new input data by selectively overwriting data entries assigned to the dictionary with the lowest priority. Thus,
compression performance is optimized for current trends in the input data stream.

Having described and illustrated the principles of the invention in a preferred embodiment thereof, it should
be apparent that the invention can be modified in arrangement and detail without departing from such
principles. We claim all modifications and variations coming within the spirit and scope of the following claims.
Claims






1. A method for compressing and decompressing data consisting of character strings using a memory based
dictionary, the method comprising:

providing a memory device including a plurality of storage locations, each having a
unique address for storing a codeword (PREVCODE) for a character string (182) as a data entry;

defining at least first and second dictionaries (450) within the plurality of storage locations of the
memory device;

storing a codeword that uniquely corresponds to a character string as each data entry (468);

assigning each stored data entry to at least one of the first and second dictionaries (468);

generating a codeword value representing an input data character string, the codeword value
associated in memory with a previously stored codeword that corresponds to a portion of the input
character string and is assigned to one of said dictionaries; and

selectively overwriting [...] with a
new codeword associated with a new character string while using all data entries in the first and second
dictionaries for generating codewords for [...] compression and
decompression (514).

2. [...] assigned an overwrite priority
and data entries are selectively overwritten [...] assigned
in said one dictionary (508).

3. A method according to claim 1 wherein the memory device has multiple states and the dictionary
assignment for each data entry is determined by the state of the memory device (328).

4. A method according to claim 3 wherein each dictionary is assigned an overwrite priority and changing
the state of the memory device [...] one of the dictionaries so that
data entries assigned to said dictionary become available [...] (504).

5. A method according to claim 1 wherein storing character strings in the memory device
comprises the following steps:

locating a storage location in the first dictionary that is available to be overwritten with a new
codeword for a character string (488);

storing the new codeword in the available storage location of the first dictionary as a new data entry
(482);

reassigning the new data entry to the second dictionary [...]; and

reassigning all data entries in the second dictionary to the first dictionary after all available storage
locations in the first dictionary have been overwritten (504).

6. A method [...]:

providing a content addressable memory (312) having a plurality of storage locations;

defining first, second, and third dictionaries within the plurality of storage locations of the content
addressable memory;

storing unique codewords as data entries in said storage locations, each codeword corresponding
to a data character string;

assigning each data entry to at least one of the first and third dictionaries (514);

generating a codeword value representing a data character string, the codeword value associated
in memory with a previously stored codeword that corresponds to the character string and is assigned to
one of said dictionaries;

prioritizing each [...] a new
codeword not presently stored as a data entry in any of said dictionaries (508); and

selectively overwriting the prioritized data entries currently assigned to said one dictionary with
new codewords corresponding [...] stored in the memory device while
at the same time using all data entries of each dictionary for generating codeword values at all times during
the compression and decompression process (488, 508, 514).

7. A method according to claim 6 wherein the content addressable memory has multiple states and the
dictionary assignment for each data entry depends upon the state of the content addressable memory (328).
8. A method according to claim 6 wherein each data entry is promoted to a dictionary with a higher priority,
thereby making the data entry less likely to be replaced with a new character string, according to the
number of times said data entry has been previously used for generating codeword values (504).

9. A method according to claim 6 wherein data entries are only available for selective replacement from a
single dictionary (508).